Question

I implemented the following algorithm to convert 16-bit PCM audio data to 8-bit:

if(encoding == AudioFormat.ENCODING_PCM_8BIT){
    int len = data.length;          
    data1 = new byte[len/2];
    int tempint;

    for (int k = 0, i=1; i < len; i+=2, k++) {
        tempint = ((int)data[i]) ^ 0x00000080; 
        data1[k] = (byte)tempint;
    }
    data=null;
}

where data is a byte[]. After running this code, the output contains a lot of noise, which suggests I'm doing something wrong here. What should I do besides dropping the lower byte?

[EDIT]: I modified the code:

if(encoding == AudioFormat.ENCODING_PCM_8BIT){
    int len = data.length;
    data1 = new byte[len/2];
    for (int i = 0; i < len/2; i++) {
        data1[i] = data[i*2+1];
    }
}

The input and output look like this:

Original data(counter:0) = 4
Original data(counter:1) = -1
Original data(counter:2) = 75
Original data(counter:3) = -1
Original data(counter:4) = 16
Original data(counter:5) = -1
Original data(counter:6) = 44
Original data(counter:7) = -1
Original data(counter:8) = 7
Original data(counter:9) = -1
Original data(counter:10) = 22
Original data(counter:11) = -1
Original data(counter:12) = 22
Original data(counter:13) = -1
Original data(counter:14) = 12
Original data(counter:15) = -1

Output data:(counter:0) = -1
Output data:(counter:1) = -1
Output data:(counter:2) = -1
Output data:(counter:3) = -1
Output data:(counter:4) = -1
Output data:(counter:5) = -1
Output data:(counter:6) = -1
Output data:(counter:7) = -1
Output data:(counter:8) = -1
Output data:(counter:9) = -1
Output data:(counter:10) = -1
Output data:(counter:11) = -1
Output data:(counter:12) = -1
Output data:(counter:13) = -1
Output data:(counter:14) = -1
Output data:(counter:15) = -1

It does not matter whether I drop the first or the second byte; the noise remains. Here I dropped the first byte (instead of the second).

Solution

The following algorithm considerably reduced the amount of noise, but didn't get rid of it completely:

if(encoding == AudioFormat.ENCODING_PCM_8BIT){
    // Requires java.nio.ByteBuffer, java.nio.ByteOrder and java.nio.ShortBuffer.
    ShortBuffer shortBuf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer();
    short[] samples16Bit = new short[shortBuf.remaining()];
    shortBuf.get(samples16Bit);
    data1 = new byte[samples16Bit.length];
    for (int i = 0; i < samples16Bit.length; i++) {
        // Keep the high byte of each sample (/256), then add 128 to move
        // from signed -128..127 to the unsigned 0..255 range of 8-bit PCM.
        data1[i] = (byte)((samples16Bit[i] / 256) + 128);
    }
}
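
One detail worth noting (a variant, not part of the original solution): samples16Bit[i] / 256 truncates toward zero, so positive and negative samples round in opposite directions, which itself adds a little distortion. An arithmetic shift rounds consistently. A minimal sketch, assuming the samples16Bit array prepared above:

// Hypothetical helper, not from the original post.
static byte[] to8BitUnsigned(short[] samples16Bit) {
    byte[] out = new byte[samples16Bit.length];
    for (int i = 0; i < samples16Bit.length; i++) {
        // >> 8 keeps the high byte and rounds toward negative infinity for
        // both signs, unlike /256 which truncates toward zero;
        // +128 maps signed -128..127 onto unsigned 0..255.
        out[i] = (byte)((samples16Bit[i] >> 8) + 128);
    }
    return out;
}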

OTHER TIPS

The noise you are experiencing is simply caused by the bit depth you have converted your audio to. The noise floor for 16-bit signals is about -96 dB, while the noise floor for 8-bit signals is about -48 dB. That may not sound like much as numbers go, but it is a huge difference. Bit-depth reduction algorithms almost always employ some kind of dithering to reduce the noise associated with the conversion. You can demonstrate the difference in quality (and noise level) quite easily by creating an 8-bit sine wave programmatically, or with any decent audio program, and just listening to the result; repeat the experiment with a 16-bit sine wave to compare. You will find that 8-bit simply isn't high quality. It's not you, it's the bit depth.
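
As a rough sketch of that dithering idea (illustrative only, not from this answer): add triangular (TPDF) noise of about one 8-bit quantization step before truncating, which trades the correlated quantization distortion for a more benign hiss:

// Hypothetical sketch, assuming samples16Bit holds the signed 16-bit samples.
static byte[] ditherTo8Bit(short[] samples16Bit, java.util.Random rng) {
    byte[] out = new byte[samples16Bit.length];
    for (int i = 0; i < samples16Bit.length; i++) {
        // Sum of two uniform values in [-128, 127] gives triangular (TPDF)
        // noise spanning roughly one 8-bit step (256) in 16-bit units.
        int dither = (rng.nextInt(256) - 128) + (rng.nextInt(256) - 128);
        int s = samples16Bit[i] + dither;
        // Clamp to the 16-bit range before dropping the low byte.
        if (s > Short.MAX_VALUE) s = Short.MAX_VALUE;
        if (s < Short.MIN_VALUE) s = Short.MIN_VALUE;
        out[i] = (byte)((s >> 8) + 128);
    }
    return out;
}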

Why not just this?

int len = data.length;          
data1 = new byte[len/2];
for (int i=0; i < len/2; ++i)
    data1[i] = data[i*2];

I'm assuming your data is big-endian. If it's little-endian, this should work:

int len = data.length;          
data1 = new byte[len/2];
for (int i=0; i < len/2; ++i)
    data1[i] = data[i*2+1];
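
One caveat worth adding to this: Android's ENCODING_PCM_8BIT expects unsigned samples, so after keeping the high byte you would still flip the sign bit, as the question's original code did with ^ 0x80. A sketch for little-endian input:

int len = data.length;
data1 = new byte[len/2];
for (int i = 0; i < len/2; ++i) {
    // High byte of each little-endian sample, with the sign bit flipped
    // to map signed -128..127 onto the unsigned 0..255 layout of 8-bit PCM.
    data1[i] = (byte)(data[i*2+1] ^ 0x80);
}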

You have a lot of noise for multiple reasons. First, you are only filling every other value of the array; the values you are not filling are automatically zero, which distorts the waveform hugely. Second, you are just picking 8 bits of the original data, which means you lose all of the information in a sample whenever those 8 bits happen to be all zeros; the information may be in the higher bits.

A naive suggestion would be to scale all of the datapoints (divide by 2^8 for signed 16-bit data) so that the highest datapoint fits in 8 bits. You will still lose information and introduce distortion, because you are forced to store the data as integers, and integer division will force values in the same (close) range to become equal after division, but this should be a lot less noisy :)
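
A minimal sketch of that scaling idea (illustrative, assuming samples16Bit holds the signed 16-bit samples): find the peak, scale so the loudest sample fits in the signed 8-bit range, then shift into the unsigned layout:

int peak = 1; // avoid division by zero on silent input
for (short s : samples16Bit) {
    int mag = Math.abs((int) s);
    if (mag > peak) peak = mag;
}
byte[] scaled = new byte[samples16Bit.length];
for (int i = 0; i < samples16Bit.length; i++) {
    // Map the loudest sample to +/-127, then shift into 0..255.
    scaled[i] = (byte)((samples16Bit[i] * 127 / peak) + 128);
}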

As a comment below pointed out, if you only take every other datapoint from the original, you will introduce a distortion known as aliasing.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow