How do wave files store multiple channels?

https://stackoverflow.com/questions/2977171

24-10-2019

Question

I've created two wave files using Audacity. Both have a 44100 Hz sample rate and 32-bit float samples, were saved as WAV (Microsoft) 16-bit signed, and contain 1 s of silence (according to Audacity). The difference is that one file contains one channel, while the other has two (stereo). When reading the one-channel file I got frames like this:

0x00 0x00  
...  ...

Just as expected, but when reading the second file I got:

0x00 0x00 0x00 0x00  
0x01 0x00 0xff 0xff  
0x00 0x00 0x00 0x00  
0x00 0x00 0x01 0x00  
0xff 0xff 0x01 0x00  
0xfe 0xff 0x03 0x00

This seems like a random pattern to me. Does it have something to do with the way channels are stored within the wave file? Shouldn't it be something like:

0x00 0x00 0x00 0x00  
...  ...  ...  ...

PS: I have used Python's built-in module 'wave' to read the files.

Solution

The very low-level signal where silence was expected may have been caused by dither used in the conversion from 32-bit to 16-bit.
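To illustrate how dither can turn digital silence into values such as 1 or -1, here is a minimal sketch. It assumes triangular (TPDF) dither of about one least-significant bit added before rounding; Audacity's actual dither algorithm and settings may differ, and `float_to_int16_with_dither` is a hypothetical helper name.

```python
import random

def float_to_int16_with_dither(samples, rng):
    """Convert floats in [-1.0, 1.0] to 16-bit ints, adding TPDF dither.

    Assumption: triangular dither spanning roughly +/-1 LSB; real
    converters may use a different shape or amplitude.
    """
    out = []
    for x in samples:
        # The sum of two uniform values has a triangular distribution.
        dither = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
        value = int(round(x * 32767 + dither))
        # Clamp to the valid signed 16-bit range.
        out.append(max(-32768, min(32767, value)))
    return out

rng = random.Random(0)          # fixed seed so the run is repeatable
silence = [0.0] * 1000          # "true" silence in floating point
converted = float_to_int16_with_dither(silence, rng)
print(sorted(set(converted)))   # a few integer values around zero
```

The input is exactly zero everywhere, yet the 16-bit output contains a scattering of non-zero samples close to zero, much like the frames in the question.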

OTHER TIPS

The data is not random

Looking at it, I see two integer values per line, each 2 bytes in little-endian order:

0x00 0x00 0x00 0x00  
0x01 0x00 0xff 0xff  
0x00 0x00 0x00 0x00  
0x00 0x00 0x01 0x00  
0xff 0xff 0x01 0x00  
0xfe 0xff 0x03 0x00

Decoded as signed 16-bit integers, that is:

0 0  
1 -1  
0 0  
0 1  
-1 1  
-2 3

So you see numbers very close to 0 (nigh silence), which looks like dither, as others suggested.
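The decoding described above can be reproduced with the standard `struct` module. This is a small sketch that treats each 4-byte line from the question as one stereo frame of two little-endian signed 16-bit samples; the names `raw` and `frames` are just illustrative.

```python
import struct

# The six frames from the question, in file order (4 bytes per stereo frame).
raw = bytes([
    0x00, 0x00, 0x00, 0x00,
    0x01, 0x00, 0xff, 0xff,
    0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x01, 0x00,
    0xff, 0xff, 0x01, 0x00,
    0xfe, 0xff, 0x03, 0x00,
])

# "<hh" = two little-endian signed 16-bit integers: first and second channel.
frames = list(struct.iter_unpack("<hh", raw))
for first, second in frames:
    print(first, second)
```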

From what I remember, the channels should alternate, so 1 second at 44.1 kHz will be a stream of 88,200 samples, alternating left and right, or whatever the spec says.

Also, Audacity should not get the float -> int conversion wrong, only the other way around. Maybe try starting out with integer samples instead of floating point. Or have one channel at a known value (e.g. 0x8f8f) and the other at 0; that might be easier to figure out.
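The "known value in one channel" experiment can be tried without Audacity. The sketch below writes a tiny stereo file into memory with the `wave` module and reads the raw frames back. The marker value 0x1234 is an arbitrary choice for this demo (picked so the byte order is easy to spot), not something from the original post.

```python
import io
import struct
import wave

SAMPLE_RATE = 44100
N_FRAMES = 4            # a handful of frames is enough to see the layout
LEFT_VALUE = 0x1234     # arbitrary marker value (an assumption for this demo)

# One interleaved frame: first-channel sample, then second-channel sample.
frame = struct.pack("<hh", LEFT_VALUE, 0)
buffer = io.BytesIO()
with wave.open(buffer, "wb") as w:
    w.setnchannels(2)
    w.setsampwidth(2)          # 2 bytes = 16-bit samples
    w.setframerate(SAMPLE_RATE)
    w.writeframes(frame * N_FRAMES)

buffer.seek(0)
with wave.open(buffer, "rb") as r:
    n_channels = r.getnchannels()
    bytes_per_frame = n_channels * r.getsampwidth()
    data = r.readframes(r.getnframes())

print(bytes_per_frame)                                          # 4
print(" ".join(f"0x{b:02x}" for b in data[:bytes_per_frame]))   # 0x34 0x12 0x00 0x00
print(SAMPLE_RATE * n_channels)                                 # 88200 samples per second
```

Each frame comes back as the marker value (low byte first) followed by two zero bytes, which is the alternating layout described above.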

Silence: "Real" silence must be zero. Otherwise it is often called "room" silence: the very small noise that is present in any recording unless you use a noise gate. Just an idea: remember that signed values use 1 bit for the sign. Maybe (I don't know) this is what you see after converting to a signed wave file in Audacity. I'm sorry, but I don't have the time to test this.

Wave files: I don't know how much you know about sound files, but if you just want to add silence, try it this way. Each sample is X bits, so you need X/8 bytes for one sample, and one frame holds one sample per channel. You know the sampling rate, so you can copy the original raw byte array into one of size (silence_length_in_samples * bytes_per_frame) + (original) + (silence_length_in_samples * bytes_per_frame) and write it back into a sound file using the Python tools, which I hope can do this.
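The padding idea above can be sketched with the `wave` module as follows. `pad_with_silence` is a hypothetical helper, and it assumes integer PCM wider than 8 bits, where silence is all-zero bytes (8-bit WAV data is unsigned, so silence there would be 0x80 instead).

```python
import io
import wave

def pad_with_silence(src, dst, seconds):
    """Copy a PCM WAV from src to dst with `seconds` of silence on both ends.

    Assumption: samples are signed integers wider than 8 bits, so silence
    is all-zero bytes.
    """
    with wave.open(src, "rb") as r:
        params = r.getparams()
        original = r.readframes(r.getnframes())

    bytes_per_frame = params.nchannels * params.sampwidth
    silence_frames = int(params.framerate * seconds)
    silence = b"\x00" * (silence_frames * bytes_per_frame)

    with wave.open(dst, "wb") as w:
        w.setparams(params)   # the frame count in the header is fixed on close
        w.writeframes(silence + original + silence)

# Demo on an in-memory mono file holding 10 frames of a constant value.
src = io.BytesIO()
with wave.open(src, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(44100)
    w.writeframes(b"\x01\x00" * 10)
src.seek(0)

dst = io.BytesIO()
pad_with_silence(src, dst, 0.5)
dst.seek(0)
with wave.open(dst, "rb") as r:
    padded_frames = r.getnframes()
print(padded_frames)   # 10 + 2 * 22050 = 44110
```

File paths can be passed instead of the `io.BytesIO` objects used in the demo.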

2 Channels: The raw bytes are organized as [sample1(channel1_bytes, channel2_bytes)][sample2(channel1_bytes, channel2_bytes)]... I hope it is clear what I mean :)
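Given that interleaved layout, the two channels can be separated by taking every second sample. The following is a minimal sketch using the standard `array` module. It assumes 16-bit samples and that the `"h"` type code is 2 bytes wide, which holds on common platforms; the sample bytes are taken from the frames in the question.

```python
import array
import sys

# Three stereo frames of 16-bit samples, interleaved as L0 R0 L1 R1 L2 R2.
raw = bytes([
    0x01, 0x00, 0xff, 0xff,
    0xff, 0xff, 0x01, 0x00,
    0xfe, 0xff, 0x03, 0x00,
])

samples = array.array("h")      # "h" = signed short, 16-bit on common platforms
samples.frombytes(raw)
if sys.byteorder == "big":
    samples.byteswap()          # WAV sample data is little-endian

n_channels = 2
left = samples[0::n_channels]   # every 2nd sample, starting at index 0
right = samples[1::n_channels]  # every 2nd sample, starting at index 1
print(list(left), list(right))  # [1, -1, -2] [-1, 1, 3]
```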

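You can see what one of those lines looks like when read as a 32-bit float with this code, which takes the four bytes in file order (little-endian):

import struct
# The last line of the dump: 0xfe 0xff 0x03 0x00, in file order
struct.unpack("<f", bytes([0xfe, 0xff, 0x03, 0x00]))

Read this way the value is a subnormal on the order of 1e-40, arguably silent. Keep in mind that the file in the question was saved as 16-bit signed, so reading its data as floats is only a rough check. I generated silence and saved it as a 32-bit floating point WAV and did not get small numbers. My file contained zeros, excluding the header.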

0.2 seconds of silent, 2-channel floating point data can be generated like so:

import array
# 44100 frames/s * 2 channels * 0.2 s = 17640 float samples, all zero
silence = array.array("f", [0] * int(44100 * 2 * 0.2))

Licensed under: CC-BY-SA with attribution
