Question

I'm interested in precisely extracting portions of a PCM WAV file, down to the sample level. Most audio modules seem to rely on platform-specific audio libraries. I want this to be cross-platform, and speed is not an issue. Are there any pure-Python audio modules that can do this?

If not, I'll have to interpret the PCM binary myself. While I'm sure I can dig up the PCM specs fairly easily, and raw formats are easy enough to walk, I've never actually dealt with binary data in Python before. Are there any good resources that explain how to do this? Anything specific to audio would be icing on the cake.


Solution

I read the question and the answers and I feel that I must be missing something completely obvious, because nobody mentioned the following two modules:

  • audioop: manipulate raw audio data
  • wave: read and write WAV files

Perhaps I come from a parallel universe and Guido's time machine is actually a space-time machine :)

Should you need example code, feel free to ask.
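
For what it's worth, here is a minimal sketch of sample-accurate extraction using only the wave module. The function name extract_samples and its parameters are made up for illustration, and it assumes an uncompressed PCM file. Note that audioop was deprecated in Python 3.11 and removed in 3.13, so the sketch does not rely on it.

```python
import wave

def extract_samples(src_path, dst_path, start_frame, n_frames):
    """Copy n_frames sample frames, starting at start_frame, into a new WAV file.

    Hypothetical helper; a "frame" here is one sample for every channel.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        src.setpos(start_frame)            # seek to an exact sample frame
        frames = src.readframes(n_frames)  # raw bytes: frames * channels * sample width
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)              # frame count in the header is patched on close
        dst.writeframes(frames)
    # Number of frames actually copied (may be fewer near the end of the file)
    return len(frames) // (params.nchannels * params.sampwidth)
```

Calling extract_samples("in.wav", "out.wav", 2002, 2002) would copy the second video frame's worth of audio at the 48 kHz, 23.976 fps rate, assuming in.wav exists and is long enough.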

PS: Assuming a 48 kHz sampling rate, a video frame at 24/1.001 == 23.976023976… fps is 2002 audio samples long, and at 25 fps it is 1920 audio samples long.
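
The arithmetic behind those two numbers can be checked exactly with the fractions module, which avoids floating-point rounding. This is only an illustrative sketch; samples_per_video_frame is a name invented here.

```python
from fractions import Fraction

def samples_per_video_frame(sample_rate, fps):
    """Exact number of audio samples spanned by one video frame."""
    return Fraction(sample_rate) / Fraction(fps)

ntsc_film = Fraction(24000, 1001)   # 24/1.001 fps, kept as an exact ratio

print(samples_per_video_frame(48000, ntsc_film))  # 2002
print(samples_per_video_frame(48000, 25))         # 1920
```

If the result is not a whole number for some other rate pair, the Fraction makes that visible instead of silently rounding.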

OTHER TIPS

I've only written a PCM reader in C++ and Java, but the format itself is fairly simple. A decent description can be found here: http://ccrma.stanford.edu/courses/422/projects/WaveFormat/

Beyond that, you should be able to read the file in as binary data (see http://www.johnny-lin.com/cdat_tips/tips_fileio/bin_array.html) and work with the resulting array. You may need some bit shifting to assemble the bytes into correctly aligned samples (https://docs.python.org/reference/expressions.html#shifting-operations), but depending on how you read the data in, you might not need to.
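
As an illustration of where bit shifting comes in: 24-bit samples have no matching struct or array type code, so the three bytes have to be combined by hand. A rough sketch, assuming packed little-endian signed 24-bit PCM (the function name decode_pcm24 is invented for this example):

```python
def decode_pcm24(raw):
    """Decode packed little-endian signed 24-bit PCM bytes into a list of ints."""
    samples = []
    usable = len(raw) - len(raw) % 3   # ignore a trailing partial sample
    for i in range(0, usable, 3):
        # Least significant byte comes first in little-endian data
        value = raw[i] | (raw[i + 1] << 8) | (raw[i + 2] << 16)
        if value & 0x800000:           # sign bit set: wrap into the negative range
            value -= 1 << 24
        samples.append(value)
    return samples
```

For 8-bit and 16-bit data no shifting is needed, since the struct and array modules can unpack those widths directly. int.from_bytes(chunk, "little", signed=True) is a built-in alternative to the manual shifts.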

All of that said, I'd still lean towards David's approach.

Is it really important that your solution be pure Python, or would you accept something that can work with native audio libraries on various platforms (so it's effectively cross-platform)? There are several examples of the latter at http://wiki.python.org/moin/PythonInMusic

It seems that a combination of open(..., "rb"), the struct module, and some details about the WAV/RIFF file format (there is probably a better reference out there) will do the job.
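
A rough sketch of that combination, assuming a plain uncompressed PCM file small enough to read into memory. The function name read_wav_chunks is made up, and the field order follows the common 16-byte "fmt " chunk layout described in WAV format references.

```python
import struct

def read_wav_chunks(path):
    """Return (fmt_fields, data_bytes) from an uncompressed PCM WAV file."""
    with open(path, "rb") as f:
        riff, _size, wave_id = struct.unpack("<4sI4s", f.read(12))
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        fmt = data = None
        while True:
            header = f.read(8)
            if len(header) < 8:        # end of file
                break
            chunk_id, chunk_size = struct.unpack("<4sI", header)
            body = f.read(chunk_size)
            if chunk_size % 2:         # chunks are padded to an even length
                f.read(1)
            if chunk_id == b"fmt ":
                # format tag, channels, sample rate, byte rate, block align, bits per sample
                fmt = struct.unpack("<HHIIHH", body[:16])
            elif chunk_id == b"data":
                data = body
        return fmt, data
```

With the block align value (bytes per sample frame) from the fmt fields, sample frames m through n are simply data[m * block_align : n * block_align].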

Just curious, what do you intend on doing with the raw sample data?

I was looking this up and found this: http://www.swharden.com/blog/2009-06-19-reading-pcm-audio-with-python/. It requires NumPy (and matplotlib if you want to graph the data).

import numpy
# Map a headerless file of native-endian signed 16-bit samples into memory
data = numpy.memmap("test.pcm", dtype='h', mode='r')
print("VALUES:", data)

Check out the original author's site for more details.

Licensed under: CC-BY-SA with attribution