Question

I’m trying to develop an application that can identify an animal from a sound clip. I take an AMR recording, read its byte array, pass that data through an FFT, and calculate the amplitudes from the result.

AMR file sample frequency: 8 kHz (standard AMR, 15 seconds)

Number of FFT points: 4096 for an input of 8192 values

I then calculate the amplitude as amplitude = 2 * FFT point value / 8192
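For reference, the calculation described above can be sketched roughly as follows. This is only an illustration under some assumptions: the samples are already decoded PCM values (not raw AMR bytes), the "FFT point value" is taken to be the magnitude of the complex FFT output, and the function names `fft`, `amplitude_spectrum` and `peak_frequency` are made up for this example. With 8192 input values at 8 kHz, each of the 4096 bins is 8000 / 8192 ≈ 0.98 Hz wide.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def amplitude_spectrum(samples, sample_rate):
    """Return (frequencies, amplitudes) for the first N/2 FFT bins."""
    n = len(samples)
    spectrum = fft(samples)
    half = n // 2  # e.g. 4096 bins for 8192 input values
    freqs = [k * sample_rate / n for k in range(half)]
    # amplitude = 2 * |FFT point value| / N, as in the formula above
    amps = [2.0 * abs(spectrum[k]) / n for k in range(half)]
    return freqs, amps

def peak_frequency(samples, sample_rate):
    """Frequency and amplitude of the strongest bin, skipping the DC bin."""
    freqs, amps = amplitude_spectrum(samples, sample_rate)
    k = max(range(1, len(amps)), key=amps.__getitem__)
    return freqs[k], amps[k]
```

Feeding this a synthetic sine wave with a known frequency and amplitude is a quick way to check that the scaling and the bin-to-frequency mapping behave as expected before trying real recordings.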

My intention is to find the spike at the frequency with the highest amplitude. The issue is that this spike is not consistent across different sound clips of the same animal: for another clip, the frequency with the highest amplitude changes. Is there a reason for this? Any help or guidance would be appreciated. Thanks in advance.

Solution

Your file has a sample frequency of 8 kHz, but the upper limit of human hearing is roughly 20 kHz, so are you sure you are respecting the Nyquist limit for your samples? At 8 kHz, only frequencies up to 4 kHz can be represented (.wav files commonly use sample rates of 44.1 kHz or 48 kHz).

The Nyquist sampling criterion states that to sample a given signal you must use a sample frequency that is at least twice the maximum frequency present in that signal.
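To make this concrete, here is a small sketch of what happens to a tone above the limit. The helper names `nyquist_limit` and `alias_frequency` are made up for illustration, and the code assumes an ideal sampler with no anti-aliasing filter in front of it (a real recorder would normally filter such content out rather than fold it back).

```python
def nyquist_limit(sample_rate_hz):
    """Highest frequency that can be represented at this sample rate."""
    return sample_rate_hz / 2.0

def alias_frequency(tone_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after ideal sampling."""
    folded = tone_hz % sample_rate_hz
    # Frequencies above the Nyquist limit fold back below it
    return min(folded, sample_rate_hz - folded)
```

For example, `alias_frequency(5000, 8000)` gives 3000: a 5 kHz component sampled at 8 kHz is indistinguishable from a 3 kHz one, while anything at or below 4 kHz comes through unchanged.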

Also, the same animal can and will make different sounds, so your average frequency will never be the same for two different samples. Do you have a tolerance threshold that accounts for different average frequencies?
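One very simple way to express such a tolerance is sketched below. The function names and the 10% default are assumptions chosen purely for illustration, not recommended values; a suitable threshold would have to come from your own recordings.

```python
def within_tolerance(reference_hz, observed_hz, rel_tol=0.10):
    """True if observed_hz lies within rel_tol (a fraction) of reference_hz."""
    return abs(observed_hz - reference_hz) <= rel_tol * reference_hz

def matches_any(reference_peaks_hz, observed_hz, rel_tol=0.10):
    """True if the observed peak is close to any known peak for the animal."""
    return any(within_tolerance(r, observed_hz, rel_tol)
               for r in reference_peaks_hz)
```

With this, a clip peaking at 950 Hz would still match a reference peak of 1000 Hz, while one peaking at 1200 Hz would not.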

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow