How can I graph the intonation of a voice sample?

https://stackoverflow.com/questions/12188049

29-06-2021

Question

I want to make an iOS app that graphs the intonation (the rise and fall of the pitch of the voice) of an audio sample recorded by the user. Intonation is very important in many languages around the world, and the app would be a way to practice intonation as well as pronunciation.

I am not well versed in the world of speech/audio technology, so what do I need? Are there libraries that ship with Cocoa Touch that give me access to the data I need from a voice sample? What exactly should I be looking to capture?

If anyone has an idea of the technology I am going to need to leverage, I would appreciate a point in the right direction.

Thanks!

Solution

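What you're looking for is called pitch tracking (also known as pitch detection or f0 estimation). It is often discussed alongside formant analysis, but the two measure different things.

Intonation is the movement of the fundamental frequency, f0, which is the rate at which the vocal folds vibrate and what we hear as pitch. Formants are, in essence, the spectral peaks of the uttered sounds, caused by resonances of the vocal tract. They are listed in order of frequency, as in f1, f2, etc. So what you're looking to plot is f0 over time, not f1.

Formant analysis is widely used in speech analysis, and f1 and f2 are usually enough to tell vowels apart, which could help with the pronunciation side of your app. I'd recommend you search for pitch detection algorithms (autocorrelation and YIN are common starting points) as well as formant analysis algorithms, and take it from there.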
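To make the idea concrete, here is a minimal sketch of how the rise and fall of pitch could be extracted: cut the recording into short overlapping frames, estimate the fundamental frequency of each frame by autocorrelation, and plot the result against time. It is written in plain Python for readability rather than for iOS. The function names, the 75 to 500 Hz search range, the 0.3 voicing threshold, and the frame sizes are all assumptions chosen for illustration, not values from any particular library.

```python
import math


def estimate_f0(frame, sample_rate, fmin=75.0, fmax=500.0, threshold=0.3):
    """Estimate the fundamental frequency (Hz) of one frame by autocorrelation.

    Returns None if the frame is silent, too short, or looks unvoiced.
    fmin, fmax and threshold are illustrative defaults (assumptions).
    """
    n = len(frame)
    if n == 0:
        return None
    mean = sum(frame) / n
    x = [s - mean for s in frame]            # remove any DC offset
    energy = sum(s * s for s in x)
    min_lag = int(sample_rate / fmax)        # shortest period to consider
    max_lag = int(sample_rate / fmin)        # longest period to consider
    if energy == 0.0 or min_lag < 1 or n <= max_lag:
        return None

    best_lag, best_corr = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Autocorrelation at this lag, normalised by the frame energy.
        corr = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if corr > best_corr:
            best_lag, best_corr = lag, corr

    if best_corr < threshold:                # weak periodicity: treat as unvoiced
        return None
    return sample_rate / best_lag


def pitch_contour(samples, sample_rate, frame_ms=40, hop_ms=10):
    """Return a list of (time_in_seconds, f0_or_None) pairs to plot."""
    frame_len = int(sample_rate * frame_ms / 1000)
    hop = int(sample_rate * hop_ms / 1000)
    contour = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        f0 = estimate_f0(samples[start:start + frame_len], sample_rate)
        contour.append((start / sample_rate, f0))
    return contour


def synth_tone(freq, sample_rate, seconds):
    """Hypothetical helper: a pure sine tone for trying the functions out."""
    count = int(sample_rate * seconds)
    return [math.sin(2 * math.pi * freq * i / sample_rate) for i in range(count)]
```

Calling `pitch_contour(synth_tone(200.0, 8000, 1.0), 8000)` yields one estimate every 10 ms. For a real voice, the `None` entries mark pauses and unvoiced consonants and would simply be left as gaps in the graph. Real speech usually needs more care than this, for example smoothing the contour and guarding against octave errors.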

Good luck :)

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow