How would you compare a spoken word to an audio file?

https://stackoverflow.com/questions/4255359

27-09-2019

Question

How would you go about comparing a spoken word to an audio file and determining if they match? For example, if I say "apple" to my iPhone application, I would like for it to record the audio and compare it with a prerecorded audio file of someone saying "apple". It should be able to determine that the two spoken words match.

What kind of algorithm or library could I use to perform this kind of voice-based audio file matching?

Solution

CMU Sphinx does speech recognition, and PocketSphinx has been ported to the iPhone by Brian King.

See https://github.com/KingOfBrian/VocalKit

He has provided excellent details and made it easy to implement yourself. I've run his example and adapted it into my own version.

Other tips

You should look up acoustic fingerprinting (see the Wikipedia link below). Shazam does essentially this for music.

http://en.wikipedia.org/wiki/Acoustic_fingerprint
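
To make the fingerprinting idea concrete, here is a rough sketch in Python (chosen for brevity; it is not iPhone code). It assumes a very simplified scheme: take the strongest frequency bin in each frame, turn consecutive peaks into a set of pairs, and compare two sets by their overlap. The function names, the frame size, and the synthetic tones standing in for recordings are all made up for illustration; real fingerprinting systems are considerably more elaborate.

```python
import cmath
import math


def dominant_bins(samples, frame_size=64):
    """Return the index of the strongest frequency bin in each frame."""
    peaks = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        best_bin, best_mag = 0, -1.0
        # Naive DFT over the positive-frequency bins (skipping DC at k=0).
        for k in range(1, frame_size // 2):
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(frame)
            )
            if abs(coeff) > best_mag:
                best_bin, best_mag = k, abs(coeff)
        peaks.append(best_bin)
    return peaks


def fingerprint(samples, frame_size=64):
    """Turn consecutive peak bins into a set of (bin, next_bin) pairs."""
    peaks = dominant_bins(samples, frame_size)
    return set(zip(peaks, peaks[1:]))


def similarity(fp_a, fp_b):
    """Jaccard overlap between two fingerprints, from 0.0 to 1.0."""
    if not fp_a and not fp_b:
        return 1.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def tone(freq_bin, frames, frame_size=64):
    """Synthetic stand-in for audio: a sine sitting exactly on one DFT bin."""
    return [
        math.sin(2 * math.pi * freq_bin * n / frame_size)
        for n in range(frames * frame_size)
    ]


# Toy "recordings" (hypothetical data, not real speech).
reference = tone(5, 4) + tone(9, 4)
same = tone(5, 4) + tone(9, 4)
different = tone(12, 4) + tone(3, 4)

score_same = similarity(fingerprint(reference), fingerprint(same))
score_diff = similarity(fingerprint(reference), fingerprint(different))
```

A score near 1.0 suggests the two signals share the same spectral landmarks. Keep in mind that fingerprinting is built for recognising the same recording again, so two different people saying "apple" may well produce fingerprints that barely overlap.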

I know this question is old, but I discovered this library today:

http://www.ispikit.com/

You can use a neural network library and train it to recognize different speech patterns. This requires some know-how about the general theory of neural networks and how they can be trained to behave in a particular way. If you know nothing about the subject, you can start with just the basics and then use a library rather than implementing something yourself. Hope that helps.
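
As a feel for what "training" means here, the following is a minimal sketch of the smallest possible network: a single sigmoid neuron fitted by gradient descent. It is written in plain Python rather than with any particular library, and the two-number feature vectors (imagine energy in a low and a high frequency band) and their labels are invented purely for illustration. A real speech recogniser would need proper audio features and a far larger model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(features, labels, epochs=500, lr=0.5):
    """Fit a single sigmoid neuron with plain stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = pred - y  # gradient of the log loss w.r.t. the pre-activation
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(model, x):
    """Return True if the neuron classifies x as the target word."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5


# Hypothetical feature vectors: label 1 = target word, label 0 = anything else.
features = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15],
            [0.1, 0.9], [0.2, 0.8], [0.15, 0.7]]
labels = [1, 1, 1, 0, 0, 0]

model = train(features, labels)
```

Calling `predict(model, [0.88, 0.12])` then asks the trained neuron whether a new, unseen feature vector looks like the target word. A library would handle the same loop for you, with more layers and better optimisers.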

License: CC-BY-SA with attribution

Not affiliated with StackOverflow