I have taken amplitude data from a 10-second clip of an mp3 and performed a fast Fourier transform (FFT) on it to get the clip's data in the frequency domain (shown in the first figure). I would now like to determine the frequencies at which the peaks are located.

[Figure: Amplitude to Frequency]

I started by smoothing the data, which can be seen below in the blue and red plots. I created a threshold that the peaks must be over in order to be considered. This is the horizontal blue line on the third plot below. As can be seen, my peak detection code worked, to an extent.

[Figure: Smoothing and peak detection]

The problem that I am having now is evident in the final plot shown below. My code is flagging minor local maxima that are part of a larger overall peak. I need a way to filter these out so that I get only a single marker for each peak. That is, for the peak shown below I only want a marker at the absolute peak, not at each minor peak along the way.

[Figure: Enlarged view of peak detection]

My peak detection code is shown below:

max_locations = []
for i, item in enumerate(xavg): #xavg contains all the smoothed data points
    if xavg[i] > threshold: #points must be above the threshold
        #if not the first or last point (so index isn't out of range)
        if (i > 0) and (i < (len(xavg)-1)):
            #greater than points on either side
            if (xavg[i] > xavg[i-1]) and (xavg[i] > xavg[i+1]):
                max_locations.append(i)

EDIT: I think I didn't state my problem clearly enough. I want to find the locations of the 5 or so highest spikes on the plot, not just the highest point overall. I am basically trying to give the clip an audio fingerprint by marking its dominant frequencies.

EDIT2: Some more code to help show what I'm doing with regards to the FFT and smoothing:

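import numpy as np

def movingaverage(interval, window_size):
    # Boxcar (uniform) window whose weights sum to 1
    window = np.ones(int(window_size))/float(window_size)
    return np.convolve(interval, window, 'same')

fft = np.fft.rfft(song) # song holds the clip's amplitude samples
xavg = movingaverage(np.abs(fft), 21)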
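
Since the goal is the frequency of each peak rather than its array index, the index-to-Hz conversion might look like the sketch below. The 44100 Hz sample rate, the synthetic `song`, and the helper name `bin_to_hz` are assumptions for illustration and are not part of the original code.

```python
import numpy as np

def bin_to_hz(indices, n_samples, sample_rate):
    """Map rfft bin indices to frequencies in Hz (hypothetical helper)."""
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / sample_rate)
    return freqs[np.asarray(indices)]

# Assumed data: a 1-second clip sampled at 44.1 kHz containing a 440 Hz tone.
sample_rate = 44100
t = np.arange(sample_rate) / sample_rate
song = np.sin(2 * np.pi * 440.0 * t)

spectrum = np.abs(np.fft.rfft(song))
peak_bin = int(np.argmax(spectrum))  # index of the strongest bin
peak_hz = bin_to_hz([peak_bin], len(song), sample_rate)[0]
```

The same lookup should work for a whole list of peak indices, as long as `n_samples` is the length of the signal passed to `np.fft.rfft`.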

Solution 2

Peak finding is pretty tricky, and I would avoid implementing your own code if possible. Try using scipy.signal.find_peaks_cwt; there are a few parameters you can play around with. With this function I think you don't need to smooth the data beforehand, since one of the parameters is basically a list of lengths over which to smooth the data. Roughly speaking, the algorithm smooths the data on one length scale, looks for peaks, smooths on another length scale, looks for peaks, and so on. Then it keeps the peaks that appear at all or most length scales.
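
As a rough sketch of how that call might look, the example below runs `find_peaks_cwt` on a synthetic spectrum. The data (three Gaussian bumps plus noise) and the `widths` range are assumptions chosen for illustration; real FFT data will likely need different widths.

```python
import numpy as np
from scipy.signal import find_peaks_cwt

# Assumed stand-in for abs(fft): three Gaussian bumps plus a little noise.
rng = np.random.default_rng(0)
x = np.arange(1000)
centers = [200, 500, 800]
spectrum = sum(np.exp(-0.5 * ((x - c) / 10.0) ** 2) for c in centers)
spectrum = spectrum + 0.02 * rng.standard_normal(x.size)

# widths lists the length scales (in samples) at which peaks are sought.
peak_indices = find_peaks_cwt(spectrum, widths=np.arange(5, 30))
```

The returned indices are approximate peak positions. Choosing `widths` to bracket the expected peak width in samples is usually the parameter that matters most.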

Other tips

Your values can be partitioned into alternating over-threshold and under-threshold regions. As you find local maxima, keep track of which one is greatest until the values dip under the threshold again. Set that "regional" maximum aside as a true peak, then continue with the next over-threshold region. Something like:

# Store the true peaks
peaks = []

# max_location holds (position, value) of the largest local maximum in the
# current over-threshold region; (None, 0) means none has been found yet.
# Here the first value is treated as a possible local maximum.
if xavg[0] > threshold and xavg[0] > xavg[1]:
    max_location = (0, xavg[0])
else:
    max_location = (None, 0)

# Loop over interior indices only, so i-1 and i+1 are always valid.
for i in range(1, len(xavg) - 1):
    if xavg[i] > threshold:
        if ((xavg[i] > xavg[i-1]) and
            (xavg[i] > xavg[i+1]) and
            xavg[i] > max_location[1]):
            max_location = (i, xavg[i])
    else:
        # If we found a largest local maximum in the region that just ended,
        # save it as a true peak, then reset until we next exceed the threshold
        if max_location[0] is not None:
            peaks.append(max_location[0])
        max_location = (None, 0)

# Do you consider the last value a possible maximum?
if (xavg[-1] > threshold and xavg[-1] > xavg[-2]
        and xavg[-1] > max_location[1]):
    max_location = (len(xavg) - 1, xavg[-1])

# Check one last time in case the data ended while still over the threshold.
if max_location[0] is not None:
    peaks.append(max_location[0])

You are currently making a list (max_locations) of every point whose previous and next points are both below it. If you are only interested in the absolute maximum within a range, you could do it like this:

# xavg is a NumPy array (no .index method), so use np.argmax on the slice
startPosition + np.argmax(xavg[startPosition:endPosition])

Alternatively, if you want to keep your code intact, you could check whether the current point is larger than the largest one found so far before storing it in max_location (a single index here, rather than a list):

if xavg[i] > xavg[max_location]: max_location = i

There is a LOT of literature about this. Off the top of my head, you have at least four options. I particularly recommend options 2 and 3.

  • As suggested by chepner, treat everything as the same peak until the values fall below your threshold.
  • Alternatively, if you think you might have a peak that does not quite fall below the threshold before rising again (a double peak), you can calculate the derivative of your FFT magnitude. Then, when the signal falls below the threshold OR the derivative crosses zero from negative to positive (a valley between peaks), you treat what follows as a distinct peak.
  • You can always fit functions to your peaks, say Gaussians or Lorentzians. This is probably the most robust approach of the ones I mention.
  • Determine the highest peak, save it, and remove the N points around it (N = 3, 4, 5, ...; you choose), then locate the second peak and remove the same number of points around it, and so on.
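
The last option in the list above can be sketched in a few lines of NumPy. The helper name `top_peaks`, the `exclusion` half-width, and the toy data are assumptions for illustration. The half-width in particular needs tuning so that it covers a whole peak in the real spectrum.

```python
import numpy as np

def top_peaks(values, n_peaks=5, exclusion=10):
    """Pick the n_peaks largest values, blanking `exclusion` samples on
    each side of every pick before searching again (hypothetical helper)."""
    work = np.asarray(values, dtype=float).copy()
    peaks = []
    for _ in range(n_peaks):
        idx = int(np.argmax(work))
        if not np.isfinite(work[idx]):
            break  # every sample has already been blanked out
        peaks.append(idx)
        lo = max(0, idx - exclusion)
        hi = min(len(work), idx + exclusion + 1)
        work[lo:hi] = -np.inf  # remove this peak and its neighbourhood
    return peaks

# Toy data standing in for the smoothed spectrum.
xavg = np.array([0, 1, 5, 4, 1, 0, 2, 9, 8, 2, 0, 3, 1], dtype=float)
found = top_peaks(xavg, n_peaks=3, exclusion=2)
```

Peaks come back in order of decreasing height, which suits the "5 or so highest spikes" goal. If `exclusion` is narrower than a peak, the shoulder of an already chosen peak may be picked up as the next one.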
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow