Understanding the process of recording audio in digital format is essential for every modern audio engineer. It not only helps improve recording procedures and audio quality but also gives you control over the whole recording process. Today we are going to look at two basic but very important concepts in the digital audio world: sample rate and bit depth.

Although these two terms have often been mixed up, they refer to two different concepts in audio.

Sample Rate and Time

When we record audio in digital format we transform the continuous analogue signal into discrete values. This process is called sampling. If we represent a waveform on a graph, we could do it by drawing a continuous line or by marking points that resemble the shape of the waveform. The more points we draw, the closer we get to the continuous line. That's how sampling works: each of those points on the graph is a sample that the computer collects. The higher the number of samples, the closer we get to the true analogue signal. But how many samples should the computer collect, and how often? This is where sample rate comes into play.
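
To make the idea concrete, here is a minimal sketch in Python (assuming NumPy, with a hypothetical 1 kHz sine standing in for the continuous analogue signal) of how a computer collects samples at regular intervals:

```python
import numpy as np

# A hypothetical 1 kHz sine standing in for the continuous analogue signal.
freq = 1000.0          # signal frequency in Hz
sample_rate = 44100.0  # samples collected per second
duration = 0.002       # capture 2 ms of audio

# The sample instants: one point every 1/sample_rate seconds.
t = np.arange(0, duration, 1.0 / sample_rate)

# Each sample is the signal's amplitude at that instant.
samples = np.sin(2 * np.pi * freq * t)

print(f"{len(samples)} samples collected in {duration * 1000:.0f} ms")
# -> 89 samples collected in 2 ms (one every ~22.7 microseconds)
```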

Sample rate is a value that defines how many times per second the computer samples the analogue signal. It is measured in Hertz, and it is key to getting an accurate digital recording of an audio signal.

Nyquist Theorem

No discussion of sampling rate would be complete without mentioning the Nyquist theorem.

The Nyquist Theorem takes its name from Harry Nyquist, a Swedish-born engineer who worked as a researcher at Bell Labs in the first half of the 20th century. The theorem is a result in signal processing that defines the minimum sampling rate needed to accurately reconstruct a continuous-time signal, like sound: the rate needs to be at least twice the highest frequency, in Hertz, of the signal to be sampled. Since human hearing ranges from 20 Hz to 20 kHz, based on this theorem the minimum rate needed to accurately represent this range of sounds is 40 kHz. In other words, the computer must collect at least 40,000 samples every second to capture the analogue signal.

The Sample Rate needs to be at least twice the highest frequency to be sampled.

Any sampling rate below this value (the Nyquist rate) will create a type of distortion called aliasing, where certain frequencies are misinterpreted during playback because there is not enough information to reconstruct them. This is especially noticeable in the higher frequencies of the audio spectrum. To avoid this type of distortion, filters are applied during signal digitization to remove any "confusing" information that could cause artifacts. You can learn more about the importance of filters and aliasing in previous articles on our blog.
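
As a rough illustration of both ideas (a sketch, not production code; the helper names are made up for this example), we can compute the minimum sample rate for a given bandwidth, and the frequency an undersampled tone folds down to:

```python
def nyquist_rate(f_max_hz):
    """Minimum sample rate needed to capture content up to f_max_hz."""
    return 2.0 * f_max_hz

def alias_frequency(f_hz, sample_rate_hz):
    """Frequency a pure tone folds down to when sampled at sample_rate_hz."""
    folded = f_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

print(nyquist_rate(20_000))             # 40000.0 -> at least 40 kHz for full-range audio
print(alias_frequency(30_000, 44_100))  # 14100   -> a 30 kHz tone plays back at 14.1 kHz
```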

In the 1970s, a 44.1 kHz sampling rate became the standard for audio digitization. Increasing the sampling rate from 40 kHz to 44.1 kHz allowed for gentler filtering of the audio without losing any audible information.

Bit Depth and Amplitude

Now that we know how many samples per second need to be recorded to capture the audible frequency spectrum, it is time to look at the other component of sound: amplitude.

Computer information is stored in binary: every piece of information on your hard drive is a combination of zeros and ones. Each of these digits is called a bit; a single bit can store 2 states or values, since it can be either 0 or 1. By increasing the number of bits, the amount of storable data grows exponentially: 2 bits store 4 values, 3 bits store 8, 4 bits store 16, and so on. To calculate the number of values, you just raise 2 to the power of the number of bits (2^n). The complexity of the file determines how many bits are needed to store its information.
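
A quick way to see this exponential growth (a trivial Python sketch):

```python
# Number of discrete values a given number of bits can represent: 2 ** n.
for bits in (1, 2, 3, 4, 8, 16, 24):
    print(f"{bits:>2} bits -> {2 ** bits:,} values")

#  1 bits -> 2 values
#  2 bits -> 4 values
#  3 bits -> 8 values
#  4 bits -> 16 values
#  8 bits -> 256 values
# 16 bits -> 65,536 values
# 24 bits -> 16,777,216 values
```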

Digital audio uses pulse-code modulation (PCM) to encode the dynamic range of our signal. PCM is the standard method for capturing the amplitude at each sampled point: it assigns a number of bits to each sample, providing a range of discrete values to define its amplitude. This resolution is what we call bit depth. The higher the bit depth, the higher the resolution, with more points between the lowest value and the highest value.

But wait a minute. The amplitude of a signal is not discrete; it can take an infinite number of values. How can it be represented in a discrete way? Easy: by rounding.

PCM maps each analogue value to its nearest discrete one in the digital domain, in a process called quantization.
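
Here is a minimal sketch of that rounding step, assuming amplitudes normalized to the -1.0 to 1.0 range (the quantize helper is hypothetical, not a DAW API):

```python
def quantize(value, bits=16):
    """Round a normalized amplitude (-1.0..1.0) to its nearest discrete level."""
    levels = 2 ** (bits - 1)   # e.g. 32,768 levels per polarity at 16 bits
    return round(value * levels) / levels

x = 0.300000123              # a continuous analogue amplitude
xq = quantize(x, bits=16)
print(xq)                    # 0.29998779296875 -> the nearest 16-bit value
print(abs(x - xq))           # ~1.2e-05 -> the quantization error introduced by rounding
```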

It is easy to see that this process introduces errors during the analogue-to-digital conversion. Increasing the bit depth not only increases the number of possible discrete values but, more importantly, widens the separation between the noise floor and the digitized signal (the signal-to-noise ratio).

CD-quality audio has a bit depth of 16 bits, providing 65,536 discrete values and 96.3 dB of signal-to-noise ratio, while most professional recording equipment records at 24 bits, providing a resolution of 16,777,216 values and 144.5 dB of SNR.
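
Those figures follow directly from the bit depth: each extra bit adds roughly 6 dB of dynamic range, since the range of fixed-point PCM is 20 * log10(2^n) dB for n bits. A quick check in Python:

```python
import math

def dynamic_range_db(bits):
    """Dynamic range of fixed-point PCM: 20 * log10(2 ** bits), ~6.02 dB per bit."""
    return 20 * math.log10(2 ** bits)

print(f"{dynamic_range_db(16):.1f} dB")  # 96.3 dB  (CD quality)
print(f"{dynamic_range_db(24):.1f} dB")  # 144.5 dB (professional recording)
```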

Modern DAWs use 32-bit floating point to process audio internally. This format brings the quantization points closer to the actual value of the signal and provides far more headroom, which is great for mixing. On the other hand, in floating point the quantization error is less uniform than in fixed point, and the noise floor can rise with the signal. That's why it's rarely used during recording.
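
A small sketch of why floating point gives that headroom (assuming NumPy; the 1.4x overshoot is an invented example, about +2.9 dB over full scale):

```python
import numpy as np

peak = 1.4  # a mix bus peak 1.4x above full scale

# 32-bit float keeps the value intact: plenty of headroom above 0 dBFS.
as_float32 = np.float32(peak)
print(as_float32)          # 1.4 -> no information lost

# 16-bit fixed point has a hard ceiling: the sample clips at full scale.
as_int16 = np.int16(np.clip(peak * 32767, -32768, 32767))
print(as_int16 / 32767)    # 1.0 -> the overshoot is gone for good
```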

The Perfect Pair

As we have seen, sample rate and bit depth are the cornerstones of audio digitization. The debate starts when we talk about the perfect combination for good audio quality.

As we mentioned before, manufacturers decided to standardize CD audio quality at 44.1 kHz and 16 bits, and it has proven good enough for the average listener for decades.

Still, in a studio environment it is good practice to set the sampling rate and bit depth in your DAW to higher values to increase the audio quality. This allows you to downsample afterwards if needed, or even to deliver a higher-quality mix, which some streaming services recommend. 48 kHz at 24 bits is a good compromise, since it is also the audio quality required for video editing. But it's not uncommon to see studios recording at 96 kHz or even 192 kHz at 24 bits. The problem in these cases is the resulting file size, which can become very difficult to manage. In the end, balancing audio quality against available storage will determine the best sampling rate for each scenario.
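
For a sense of scale, here is a rough uncompressed-PCM size calculation (a sketch assuming stereo and decimal megabytes; real WAV files add a small header on top of this):

```python
def file_size_mb(sample_rate, bit_depth, channels=2, minutes=1):
    """Uncompressed PCM size: rate * depth * channels * seconds / 8, in megabytes."""
    bytes_total = sample_rate * bit_depth * channels * minutes * 60 / 8
    return bytes_total / 1_000_000

print(f"{file_size_mb(44_100, 16):.1f} MB/min")   # 10.6 MB/min (CD quality)
print(f"{file_size_mb(48_000, 24):.1f} MB/min")   # 17.3 MB/min
print(f"{file_size_mb(192_000, 24):.1f} MB/min")  # 69.1 MB/min
```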

What sample rate and bit depth do you record your sessions at? Can you hear the difference between a 48 kHz and a 96 kHz recording? Let us know in the comments below, and don't forget to subscribe to Sonimus' Newsletter to get the latest news from our blog.

Written By Carlos Bricio