Reducing sample bit-depth by truncating

https://stackoverflow.com/questions/4022838

26-09-2019

Question

I have to reduce the bit-depth of a digital audio signal from 24 to 16 bit.

Is taking only the 16 most significant bits of each sample (i.e. truncating) equivalent to doing a proportional calculation (out = in * 0xFFFF / 0xFFFFFF)?

Solution

I assume you mean (in * 0xFFFF) / 0xFFFFFF, in which case, yes.
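As a rough check of how close the two forms are, here is a minimal sketch for unsigned 24-bit samples. The function names truncate_u24 and scale_u24 are made up for this illustration, and the product is widened to 64 bits because in * 0xFFFF needs up to 40 bits.

```c
#include <stdint.h>

/* Truncation: keep the 16 most significant bits of an unsigned 24-bit sample. */
static uint16_t truncate_u24(uint32_t in)
{
    return (uint16_t)(in >> 8);
}

/* Proportional scaling as written in the question. The product is computed
   in 64-bit arithmetic so that it cannot overflow. */
static uint16_t scale_u24(uint32_t in)
{
    return (uint16_t)(((uint64_t)in * 0xFFFF) / 0xFFFFFF);
}
```

Both map 0 to 0 and 0xFFFFFF to 0xFFFF, but they are not bit-identical: for an input of 256, for example, the shift gives 1 while the scaled form gives 0. Over the 24-bit range the two results never differ by more than one step.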

OTHER TIPS

You'll get better sounding results by adding a carefully crafted noise signal to the original signal, just below the truncating threshold, before truncating (a.k.a. dithering).

Dithering by adding noise will in general give you better results. The key to this is the shape of the noise. The POW-r dithering algorithms use a specific noise shape and are popular in a lot of digital audio workstation applications (Cakewalk's SONAR, Logic, etc.).

If you don't need the full-on fidelity of POW-r, you can simply generate some noise at fairly low amplitude and mix it into your signal. You'll find this masks some of the quantization effects.

x * 0xffff / 0xffffff is overly pedantic, but not in a good way if your samples are signed -- and probably not in a good way in general.

Yes, you want the maximum value in your source range to match the maximum value in your destination range, but the values used there are only for unsigned ranges, and the distribution of quantisation steps means that it'll be very rare that you use the largest possible output value.

If the samples are signed then the peak positive values would be 0x7fff and 0x7fffff, while the peak negative values would be -0x8000 and -0x800000. Your first problem is deciding whether +1 is equal to 0x7fff, or -1 is equal to -0x8000. If you choose the latter then it's a simple shift operation. If you try to have both then zero stops being zero.

After that you have a problem that division rounds towards zero. This means that too many values get rounded to zero compared with other values. This causes distortion.
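The uneven bucket around zero can be seen by counting how many inputs land on each output. This is only a sketch: scale_by_division and count_inputs_for are hypothetical helper names, and a plain divide by 256 stands in for the scaling.

```c
#include <stdint.h>

/* Scale by integer division. In C99 and later, division truncates toward zero. */
static int32_t scale_by_division(int32_t in)
{
    return in / 256;
}

/* Count how many inputs in [lo, hi] map to the given output value. */
static int count_inputs_for(int32_t target, int32_t lo, int32_t hi)
{
    int count = 0;
    for (int32_t in = lo; in <= hi; ++in) {
        if (scale_by_division(in) == target) {
            ++count;
        }
    }
    return count;
}
```

Over the inputs -1024..1024, output 0 collects 511 inputs (-255 to 255), while outputs 1 and -1 collect 256 each, so the step at zero is almost twice as wide as its neighbours.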

If you want to scale according to the peak positive values, the correct form would be:

out = rint((double)in * 0x7fff / 0x7fffff);  /* needs <math.h>; double keeps the product exact, float does not */

If you fish around a bit you can probably find an efficient way to do that with integer arithmetic and no division.

This form should correctly round to the nearest available output value for any given input, and it should map the largest possible input value to the largest possible output value, but it's going to have an ugly distribution of quantisation steps scattered throughout the range.

Most people prefer:

out = (in + 128) >> 8;
if (out > 0x7fff) out = 0x7fff;

This form makes things the tiniest bit louder, to the point that positive values may clip slightly, but the quantisation steps are distributed evenly.

You add 128 because right-shift rounds towards negative infinity. The average quantisation error is -128 and you add 128 to correct this to keep 0 at precisely 0. The test for overflow is necessary because an input value of 0x7fffff would otherwise give a result of 0x8000, and when you store this in a 16-bit word it would wrap around giving a peak negative value.
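Wrapped up as a function, the rounding shift might look like the sketch below. The name round_shift_24_to_16 is made up here, the 24-bit sample is assumed to arrive sign-extended in an int32_t, and the code relies on right-shift of a negative value being an arithmetic shift, which C leaves implementation-defined.

```c
#include <stdint.h>

/* Reduce a signed 24-bit sample (held in an int32_t) to 16 bits.
   Assumes >> on a negative value is an arithmetic shift, which is
   implementation-defined in C but true on common compilers. */
static int16_t round_shift_24_to_16(int32_t in)
{
    int32_t out = (in + 128) >> 8;   /* add half a step, then floor */
    if (out > 0x7fff) {
        out = 0x7fff;                /* 0x7fffff would otherwise become 0x8000 */
    }
    return (int16_t)out;
}
```

With this form, inputs from -128 to 127 map to 0, 128 maps to 1 and -129 maps to -1, so every output value covers exactly 256 inputs apart from the clipped top step.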

C pedants can poke holes in the assumptions about right-shift and division behaviour, but I'm overlooking those for clarity.

However, as others have pointed out you generally shouldn't reduce the bit depth of audio without dithering, and ideally noise shaping. TPDF dither is as follows:

out = (in + (rand() & 255) - (rand() & 255)) >> 8;
if (out < -0x8000) out = -0x8000;
if (out > 0x7fff) out = 0x7fff;

Again, big issues with the usage of rand() which I'm going to overlook for clarity.
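One way to sidestep rand() is to carry your own generator state. The sketch below swaps in a 32-bit xorshift generator, which is my substitution and not part of the answer above; the names xorshift32 and dither_24_to_16 are made up, the dither arithmetic is otherwise the same as the snippet above, and arithmetic right-shift of negative values is again assumed.

```c
#include <stdint.h>

/* Hypothetical stand-in for rand(): a 32-bit xorshift generator.
   The state must be non-zero. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* TPDF dither: the difference of two uniform values in 0..255 is
   triangular over -255..255. Assumes arithmetic right shift. */
static int16_t dither_24_to_16(int32_t in, uint32_t *state)
{
    int32_t r1 = (int32_t)(xorshift32(state) >> 24);  /* top 8 bits: 0..255 */
    int32_t r2 = (int32_t)(xorshift32(state) >> 24);
    int32_t out = (in + r1 - r2) >> 8;
    if (out < -0x8000) out = -0x8000;
    if (out > 0x7fff) out = 0x7fff;
    return (int16_t)out;
}
```

A caller would keep one non-zero uint32_t of state per stream and pass its address on every sample, e.g. uint32_t s = 1u; out = dither_24_to_16(in, &s);.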

Licensed under: CC-BY-SA with attribution
