Question

I want to record sound (voice) using PortAudio (PyAudio) and output the corresponding sound wave on the screen. Hopeless as I am, I am unable to extract the frequency information from the audio stream so that I can draw it in Hz/time form.


Here's an example code snippet that records and plays recorded audio for five seconds, in case it helps any:

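import pyaudio

p = pyaudio.PyAudio()

chunk = 1024
seconds = 5
rate = 44100

stream = p.open(format=pyaudio.paInt16,
                channels=1,
                rate=rate,
                input=True,
                output=True)

# Integer division: range() needs an int in Python 3.
for i in range(rate // chunk * seconds):
    data = stream.read(chunk)   # raw bytes holding `chunk` 16-bit samples
    stream.write(data, chunk)   # play the chunk straight back

# Release the audio device when done.
stream.stop_stream()
stream.close()
p.terminate()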

I wish to extract the needed information from the above variable "data". (Or use some other high-level approach with PortAudio or another library with Python bindings.)


I'd be very grateful for any help! Even vaguely related tidbits of audio-analyzing wisdom are appreciated. :)

Solution

What you want is probably the Fourier transform of the audio data. There are several packages that can calculate it for you; scipy and numpy are two of them. It is often called the "Fast Fourier Transform" (FFT), but that is just the name of the algorithm used to compute it.

Here is an example of its use: https://svn.enthought.com/enthought/browser/Chaco/trunk/examples/advanced/spectrum.py
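
For the "data" variable in the question, a minimal sketch with numpy could look like the following. It assumes 16-bit mono audio at 44100 Hz, as in the question's stream settings. The function name dominant_frequency and the synthetic 440 Hz tone standing in for stream.read(chunk) are made up for illustration, and the Hann window is just one common choice.

```python
import numpy as np

RATE = 44100   # samples per second, as in the question
CHUNK = 1024   # samples per read, as in the question


def dominant_frequency(data: bytes, rate: int = RATE) -> float:
    """Return the strongest frequency (Hz) in one chunk of 16-bit mono audio."""
    # Interpret the raw bytes as signed 16-bit samples.
    samples = np.frombuffer(data, dtype=np.int16).astype(np.float64)
    # A Hann window reduces spectral leakage from the chunk edges.
    windowed = samples * np.hanning(len(samples))
    # rfft gives the spectrum of a real signal; rfftfreq gives each bin's frequency.
    magnitudes = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)
    magnitudes[0] = 0.0  # ignore the DC offset
    return float(freqs[np.argmax(magnitudes)])


# Stand-in for stream.read(CHUNK): one chunk of a synthetic 440 Hz tone.
t = np.arange(CHUNK) / RATE
fake_chunk = (10000 * np.sin(2 * np.pi * 440.0 * t)).astype(np.int16).tobytes()
print(dominant_frequency(fake_chunk))
```

Calling this once per chunk inside the recording loop gives one frequency value per chunk, which is a crude Hz/time series. With 1024 samples at 44100 Hz the bins are about 43 Hz apart, so the result is only accurate to roughly that resolution; a larger chunk gives finer frequency steps at the cost of coarser time steps.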

OTHER TIPS

The Fourier transform on its own will not help you much if you want to analyze the signal in both the frequency and time domains. You might want to have a look at wavelet transforms. There is a package called pywavelets: http://www.pybytes.com/pywavelets/#discrete-wavelet-transform-dwt
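
To give a rough idea of what a single-level discrete wavelet transform does, here is a hand-written Haar transform in plain numpy. This is only an illustrative sketch, not the pywavelets API; the function name haar_dwt is made up, and padding odd-length input by repeating the last sample is an arbitrary choice.

```python
import numpy as np


def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    x = np.asarray(signal, dtype=np.float64)
    if len(x) % 2:
        x = np.append(x, x[-1])  # pad odd-length input by repeating the last sample
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)  # low-pass: scaled local sums
    detail = (even - odd) / np.sqrt(2.0)  # high-pass: scaled local differences
    return approx, detail


approx, detail = haar_dwt([1.0, 2.0, 3.0, 4.0])
print(approx, detail)
```

Each coefficient comes from one pair of neighbouring samples, so it keeps its position in time, while the split into approximation and detail separates low from high frequencies. Applying the function again to the approximation output gives the next, coarser level.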

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow