Question

Audio processing is pretty new for me. And currently using Python Numpy for processing wave files. After calculating FFT matrix I am getting noisy power values for non-existent frequencies. I am interested in visualizing the data and accuracy is not a high priority. Is there a safe way to calculate the clipping value to remove these values, or should I use all FFT matrices for each sample set to come up with an average number ?

regards

Edit:

    from numpy import *
    import wave
    import pymedia.audio.sound as sound
    import time, struct
    from pylab import ion, plot, draw, show

    fp = wave.open("500-200f.wav", "rb")
    sample_rate = fp.getframerate()
    total_num_samps = fp.getnframes()
    fft_length = 2048.
    num_fft = (total_num_samps / fft_length ) - 2
    temp = zeros((num_fft,fft_length), float)

    for i in range(num_fft):
        tempb = fp.readframes(fft_length);
        data = struct.unpack("%dH"%(fft_length), tempb)
        temp[i,:] = array(data, short) 
    pts = fft_length/2+1
    data = (abs(fft.rfft(temp, fft_length)) / (pts))[:pts]

    x_axis = arange(pts)*sample_rate*.5/pts
    spec_range = pts
    plot(x_axis, data[0])
    show()

Here is the plot in non-logarithmic scale, for synthetic wave file containing 500hz(fading out) + 200hz sine wave created using Goldwave.

Was it helpful?

Solution

Simulated waveforms shouldn't show FFTs like your figure, so something is very wrong, and probably not with the FFT, but with the input waveform. The main problem in your plot is not the ripples, but the harmonics around 1000 Hz, and the subharmonic at 500 Hz. A simulated waveform shouldn't show any of this (for example, see my plot below).

First, you probably want to just try plotting out the raw waveform, and this will likely point to an obvious problem. Also, it seems odd to have a wave unpack to unsigned shorts, i.e. "H", and especially after this to not have a large zero-frequency component.

I was able to get a pretty close duplicate to your FFT by applying clipping to the waveform, as was suggested by both the subharmonic and higher harmonics (and Trevor). You could be introducing clipping either in the simulation or the unpacking. Either way, I bypassed this by creating the waveforms in numpy to start with.

Here's what the proper FFT should look like (i.e. basically perfect, except for the broadening of the peaks due to the windowing)

alt text

Here's one from a waveform that's been clipped (and is very similar to your FFT, from the subharmonic to the precise pattern of the three higher harmonics around 1000 Hz)

alt text Here's the code I used to generate these

from numpy import *
from pylab import ion, plot, draw, show, xlabel, ylabel, figure

sample_rate = 20000.
times = arange(0, 10., 1./sample_rate)
wfm0 = sin(2*pi*200.*times)
wfm1 = sin(2*pi*500.*times) *(10.-times)/10.
wfm = wfm0+wfm1
#  int test
#wfm *= 2**8
#wfm = wfm.astype(int16)
#wfm = wfm.astype(float)
#  abs test
#wfm = abs(wfm)
#  clip test
#wfm = clip(wfm,  -1.2, 1.2)

fft_length = 5*2048.
total_num_samps = len(times)
num_fft = (total_num_samps / fft_length ) - 2
temp = zeros((num_fft,fft_length), float)

for i in range(num_fft):
    temp[i,:] = wfm[i*fft_length:(i+1)*fft_length] 
pts = fft_length/2+1
data = (abs(fft.rfft(temp, fft_length)) / (pts))[:pts]

x_axis = arange(pts)*sample_rate*.5/pts
spec_range = pts
plot(x_axis, data[2], linewidth=3)
xlabel("freq (Hz)")
ylabel('abs(FFT)')
show()

OTHER TIPS

FFT's because they are windowed and sampled cause aliasing and sampling in the frequency domain as well. Filtering in the time domain is just multiplication in the frequency domain so you may want to just apply a filter which is just multiplying each frequency by a value for the function for the filter you are using. For example multiply by 1 in the passband and by zero every were else. The unexpected values are probably caused by aliasing where higher frequencies are being folded down to the ones you are seeing. The original signal needs to be band limited to half your sampling rate or you will get aliasing. Of more concern is aliasing that is distorting the area of interest because for this band of frequencies you want to know that the frequency is from the expected one.

The other thing to keep in mind is that when you grab a piece of data from a wave file you are mathmatically multiplying it by a square wave. This causes a sinx/x to be convolved with the frequency response to minimize this you can multiply the original windowed signal with something like a Hanning window.

It's worth mentioning for a 1D FFT that the first element (index [0]) contains the DC (zero-frequency) term, the elements [1:N/2] contain the positive frequencies and the elements [N/2+1:N-1] contain the negative frequencies. Since you didn't provide a code sample or additional information about the output of your FFT, I can't rule out the possibility that the "noisy power values at non-existent frequencies" aren't just the negative frequencies of your spectrum.


EDIT: Here is an example of a radix-2 FFT implemented in pure Python with a simple test routine that finds the FFT of a rectangular pulse, [1.,1.,1.,1.,0.,0.,0.,0.]. You can run the example on codepad and see that the FFT of that sequence is

[0j,                    Negative frequencies
(1+0.414213562373j),    ^
0j,                     |
(1+2.41421356237j),     |
(4+0j),                <= DC term
(1-2.41421356237j),     |
0j,                     v
(1-0.414213562373j)]    Positive frequencies

Note that the code prints out the Fourier coefficients in order of ascending frequency, i.e. from the highest negative frequency up to DC, and then up to the highest positive frequency.

I don't know enough from your question to actually answer anything specific.

But here are a couple of things to try from my own experience writing FFTs:

  • Make sure you are following Nyquist rule
  • If you are viewing the linear output of the FFT... you will have trouble seeing your own signal and think everything is broken. Make sure you are looking at the dB of your FFT magnitude. (i.e. "plot(10*log10(abs(fft(x))))" )
  • Create a unitTest for your FFT() function by feeding generated data like a pure tone. Then feed the same generated data to Matlab's FFT(). Do a absolute value diff between the two output data series and make sure the max absolute value difference is something like 10^-6 (i.e. the only difference is caused by small floating point errors)
  • Make sure you are windowing your data

If all of those three things work, then your fft is fine. And your input data is probably the issue.

Time doamin clipping shows up as mirror images of the signal in the frequency domain at specific regular intervals with less amplitude.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top