Peak detection in Performous code

https://stackoverflow.com/questions/3438429

27-09-2019
|

문제

I was looking to implement voice pitch detection in iphone using HPS method. But the detected tones are not very accurate. Performous does a decent job of pitch detection.

I looked through the code but i did not fully get the theory behind the calculations. They use FFT and find the peaks. But the part where they use the phase of FFT output, got me confused.I figure they use some heuristics for voice frequencies.

So,Could anyone please explain the algorithm used in Performous to detect pitch?

해결책

[Performous][1] extracts pitch from the microphone. Also the code is open source. Here is a description of what the algorithm does, from the guy that coded it (Tronic on irc.freenode.net#performous).

PCM input (with buffering)
FFT (1024 samples at a time, remove 200 samples from front of the buffer afterwards)
Reassignment method (against the previous FFT that was 200 samples earlier)
Filtering of peaks (this part could be done much better or even left out)
Combining peaks into sets of harmonics (we call the combination a tone)
Temporal filtering of tones (update the set of tones detected earlier instead of simply using the newly detected ones)
Pick the best vocal tone (frequency limits, weighting, could use the harmonic array also but I don't think we do)

I still wasn't able from this information to figure it out and implement it. If anyone manages this, please please post your results here, and comment this response so that SO notifies me.

The task would be to create a minimal C++ wrapper around this code.

라이센스 : CC-BY-SA ~와 함께 속성

제휴하지 않습니다 StackOverflow