By: LIC. EZEQUIEL MORFI | TITANIO
Much has been said about DITHERING and the purpose of its implementation. While some of the statements are indeed true, the analogies and other illustrations of what really goes on behind the scenes often end up becoming the actual definition that engineers later carry in their minds. Other times, the claims are utter fallacies.
DITHER randomizes the LSB (least-significant bits) before truncation occurs in order to prevent truncation-distortion from happening in the first place. Truncation-distortion can be defined as a rather unpleasant, odd-order harmonic-distortion that adds an undesirable quality to the spectrum.
The LSB to be randomized are usually the last two bits of the final (post-truncation) word length, although this varies depending on the developer’s design. So, for example, when converting from a 24-bit fixed-point word to a 16-bit fixed-point word, a dithering algorithm will randomize the values of bits 15 and 16, and null (value = 0) bits 17 to 24.
If used as a plug-in, the dither still has to output a word length that matches the DAW’s internal bus; this could be 32-bit or 64-bit floating-point as per current standards, but such a 32-bit or 64-bit floating-point audio stream would actually carry only 16-bit fixed-point information inside, as can be measured with a bit meter. Later on, when the DAW is made to output (i.e. render, bounce, mix down, export, etc.) a new 16-bit fixed-point file, the word length of the audio stream is indeed truncated, meaning that bits 17 to 24 are removed and discarded. However, since dither was previously applied and those bits already carry nothing but zeros, this action produces no truncation-distortion: the random noise generated by the randomization of the LSB bears no harmonic relation to the signal. No undesirable partials from truncation-distortion appear in the new 16-bit signal, at the cost of a small noise floor that is the product of the randomization of the LSB.
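The 24-bit to 16-bit example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any particular product's algorithm: the function names (`tpdf`, `dither_to_16_in_24`, `active_bits`, `truncate_to_16`) are hypothetical, the dither is a plain TPDF of 2 LSB peak-to-peak, and the requantizer rounds to the nearest 16-bit step. The output stays in a 24-bit container with bits 17 to 24 nulled, so a crude bit meter reports 16 active bits and the later truncation discards nothing.

```python
import random

STEP = 1 << 8  # one 16-bit LSB expressed in 24-bit units (assumed fixed-point layout)

def tpdf(rng):
    """Triangular-PDF random value in (-1, 1), in units of one 16-bit LSB."""
    return rng.random() - rng.random()

def dither_to_16_in_24(sample24, rng):
    """Requantize a signed 24-bit sample to 16-bit precision, kept in a 24-bit word."""
    noisy = sample24 + tpdf(rng) * STEP          # randomize around the new LSB
    q16 = int((noisy + STEP / 2) // STEP)        # round to the nearest 16-bit step
    q16 = max(-32768, min(32767, q16))           # stay inside the 16-bit range
    return q16 * STEP                            # bits 17 to 24 are now zero

def active_bits(samples24):
    """Crude bit meter: how many of the 24 bits ever carry information."""
    combined = 0
    for s in samples24:
        combined |= s & 0xFFFFFF                 # view each sample as a 24-bit pattern
    if combined == 0:
        return 0
    trailing_zeros = (combined & -combined).bit_length() - 1
    return 24 - trailing_zeros

def truncate_to_16(sample24):
    """What the export stage does: drop the lowest 8 bits."""
    return sample24 >> 8
```

Because the dithered words are already multiples of 256, `truncate_to_16` is lossless on them; applied directly to undithered 24-bit material it would discard real signal information instead.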
Besides the fact that it imparts an inharmonic timbre to the whole material, the level of the truncation-distortion partials can also be so loud as to mask the lowest-level sounds in the original signal. Therefore, dither is said to actually improve the dynamic range, even while raising the noise floor at the same time, since without the truncation-distortion, low-level sounds can now be better distinguished.
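The claim that dither lets low-level sounds survive can be illustrated with a sine wave smaller than one LSB. The sketch below is an assumption-laden toy, not a measurement: values are expressed in LSB units, the quantizer rounds to nearest, the dither is TPDF, and the names `quantize`, `tpdf` and `correlation_with_signal` are made up for the example. Without dither every sample rounds to zero and the tone disappears completely; with dither the output is noisy but still correlates with the original tone.

```python
import math
import random

def quantize(x):
    """Round a value expressed in LSB units to the nearest integer step."""
    return math.floor(x + 0.5)

def tpdf(rng):
    """Triangular-PDF dither, 2 LSB peak-to-peak."""
    return rng.random() - rng.random()

rng = random.Random(1)
n = 20000
period = 100

# A tone with a peak of 0.4 LSB: too small to ever cross a quantization threshold.
signal = [0.4 * math.sin(2 * math.pi * i / period) for i in range(n)]

plain = [quantize(s) for s in signal]                # no dither
dithered = [quantize(s + tpdf(rng)) for s in signal] # dither added before rounding

def correlation_with_signal(out):
    """Average of output times input; zero means no trace of the tone is left."""
    return sum(o * s for o, s in zip(out, signal)) / n
```

Comparing `correlation_with_signal(plain)` with `correlation_with_signal(dithered)` shows the first is exactly zero while the second sits close to the power of the original tone.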
So why not randomize only one bit? Why can’t the LSB be only the last digit? The reason is that in such a case, while truncation-distortion would indeed be prevented, the random noise generated by the dither would become modulated by the signal’s own amplitude; i.e. the noise floor would rise and fall with the signal’s level, according to each sample value at any given point. To eliminate this secondary effect, it is normally advised that two bits be randomized instead of just one; such additional randomness, although undoubtedly the cause of an even louder noise floor, completely prevents both unwanted behaviors (the truncation-distortion and a modulated noise floor) from taking place as a consequence of the word-length reduction. Hence the existence of gentler and more aggressive dithering options on the market: dithers that randomize more or less of the LSB in order to achieve a lower or a higher noise floor, at the possible risk of modulating it. Again, this is a general rule for most applications using a TPDF (triangular probability density function) dither, but not necessarily the case for all of them. It all comes down to the developer’s choice in the end.
“In other words, dither eliminates the certainty in digital audio, as this certainty is flawed from the beginning by the quantization error”
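The one-bit versus two-bit question can be made concrete with a small experiment. The sketch assumes that "one bit" corresponds to a rectangular (RPDF) dither of 1 LSB peak-to-peak and "two bits" to a triangular (TPDF) dither of 2 LSB peak-to-peak; that mapping, like the function names, is an interpretation for illustration only. It holds the input at a constant level, expressed in LSB units, and measures the variance of the requantization error.

```python
import math
import random

def rpdf(rng):
    """Rectangular dither, 1 LSB peak-to-peak."""
    return rng.random() - 0.5

def tpdf(rng):
    """Triangular dither, 2 LSB peak-to-peak (sum of two rectangular sources)."""
    return rng.random() - rng.random()

def error_variance(level, dither, rng, n=50000):
    """Variance of (quantized - input) for a constant input level in LSB units."""
    total = 0.0
    total_sq = 0.0
    for _ in range(n):
        err = math.floor(level + dither(rng) + 0.5) - level
        total += err
        total_sq += err * err
    mean = total / n
    return total_sq / n - mean * mean

rng = random.Random(3)
rpdf_on_step = error_variance(0.0, rpdf, rng)    # input sits exactly on a step
rpdf_mid_step = error_variance(0.5, rpdf, rng)   # input sits halfway between steps
tpdf_on_step = error_variance(0.0, tpdf, rng)
tpdf_mid_step = error_variance(0.5, tpdf, rng)
```

With the narrower dither the noise vanishes when the input sits on a step and reaches about 0.25 LSB² halfway between steps, so the floor breathes with the signal. With the wider dither both figures come out near 0.25 LSB²: a louder floor, but a constant one.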
NOISE SHAPING shifts the previously applied dither noise toward the extreme high end of the spectrum. Moderate noise shaping is advised, since more extreme settings could introduce a risky amount of high-frequency content near the audible range and possibly alter the program material in an undesirable way, or even be potentially harmful to the tweeters.
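One simple way to picture this shift is first-order error feedback, where the error made on each sample is subtracted from the next sample before it is requantized. The sketch below assumes that scheme purely for illustration; commercial noise shapers typically use higher-order, perceptually weighted filters, and the names `requantize`, `total_power` and `low_band_power` are hypothetical. Values are in LSB units, and the "low band" is approximated crudely by averaging the error over blocks of samples.

```python
import math
import random

def tpdf(rng):
    """Triangular dither, 2 LSB peak-to-peak."""
    return rng.random() - rng.random()

def requantize(signal, rng, shaped):
    """TPDF-dithered rounding, optionally with first-order error feedback."""
    out = []
    prev_error = 0.0
    for x in signal:
        target = x - prev_error if shaped else x   # feed the last error back in
        y = math.floor(target + tpdf(rng) + 0.5)
        prev_error = y - target                    # error made on this sample
        out.append(y)
    return out

def total_power(signal, out):
    """Mean squared error over the whole band."""
    return sum((y - x) ** 2 for x, y in zip(signal, out)) / len(signal)

def low_band_power(signal, out, block=50):
    """Power of the error after block averaging (a crude low-pass filter)."""
    errors = [y - x for x, y in zip(signal, out)]
    means = [sum(errors[i:i + block]) / block
             for i in range(0, len(errors), block)]
    return sum(m * m for m in means) / len(means)

signal = [1000.0 * math.sin(2 * math.pi * i / 200) for i in range(5000)]
flat = requantize(signal, random.Random(2), shaped=False)
shaped = requantize(signal, random.Random(2), shaped=True)
```

In this toy model the shaped version carries more noise overall but far less of it at low frequencies, which is the trade the paragraph above describes: the noise is moved, not removed.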
Also, care must be taken to avoid further processing of the signal after it has undergone noise shaping, especially with non-linear processes that would add harmonic distortion to the new high-frequency content generated by the noise shaping. This is of course highly relevant inside the studio, where the operator must be careful not to post-process program material that has previously been dithered and noise-shaped. It is of even greater importance in the context of today’s mass-consumption media and music platforms (mp3 files, Spotify®, YouTube® and other digital audio streaming services), which employ “undithered” digital volume controls, digital equalization at the sound output, loudness-normalizing gain algorithms, trim controls, built-in limiters or other DSP processes that re-quantize the audio content. These are likely to cause issues in the final playback because of the large amount of high-frequency energy previously introduced by the noise-shaping process.
“Our problem today is that we are no longer at the end of the digital signal processing chain like we were twenty years ago” - Bob Ohlsson
Although noise shaping can be applied to a signal as a separate process, it will most definitely not prevent truncation-distortion from happening after word-length reduction. The reason a noise-shaping algorithm alone (with no dither) is offered inside many software workstations and plug-ins is so that the operator can apply it when a signal has previously been dithered, albeit without noise shaping. In such a case the noise shaping operates normally and offers predictable results.
AUTO-BLANKING (sometimes referred to as ‘auto-blacking’) mutes the dither’s output as the signal approaches the LSB, in a gate-like fashion, although governed by a bit-by-bit analysis instead of time constants. When the dither application detects digital silence, it mutes itself in order to prevent the (unnecessary) random noise produced by the randomization of the LSB from being passed on to the output. This can sometimes be useful to improve the sense of stereo image, but it can also prove undesirable by causing “sputtering” or “blinking” effects in the stereo field, so great care must be taken when trying out auto-blanking.
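A minimal sketch of the idea, assuming that "digital silence" simply means input samples that are exactly zero: the dither is bypassed for those samples and the output is left at true zero. The function name and the `min_zero_run` parameter are hypothetical; the parameter only lets the example require several consecutive zero samples before muting, and real products may decide quite differently. Values are in units of the target LSB.

```python
import math
import random

def tpdf(rng):
    """Triangular dither, 2 LSB peak-to-peak."""
    return rng.random() - rng.random()

def dither_with_auto_blanking(samples, rng, min_zero_run=1):
    """Dither and round each sample, but stay silent while the input is silent."""
    out = []
    zero_run = 0
    for x in samples:
        zero_run = zero_run + 1 if x == 0 else 0    # count consecutive zero samples
        if zero_run >= min_zero_run:
            out.append(0)                           # silence detected: dither muted
        else:
            out.append(math.floor(x + tpdf(rng) + 0.5))
    return out

# Low-level material, then digital silence, then material again.
samples = [0.3] * 10 + [0] * 20 + [5.2] * 10
result = dither_with_auto_blanking(samples, random.Random(4), min_zero_run=4)
```

The abrupt switch between a dithered floor and absolute silence is what can be heard as the "sputtering" mentioned above when the material hovers around the threshold.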
EZEQUIEL MORFI | TITANIO – firstname.lastname@example.org