Voice recognition engines for embedded applications

https://stackoverflow.com/questions/1862533

16-09-2019

Question

I am trying to research available voice recognition engines and SDKs for developing a voice-enabled Windows CE application. I've run across Nuance, but don't see much else. I would prefer a .Net SDK if possible, but I imagine most would be C/C++. I appreciate any suggestions. Thanks.

Solution 5

As stated in one of my comments above, we are trying a voice recognition .Net SDK from Vangard Voice Systems. It uses Nuance's Vocon3200 voice recognition engine, which is well respected and seems to work well in early testing. We're using a cheap microphone right now and have some issues with outside noise. Hopefully that will be resolved with noise-cancelling headsets. The software model is a bit lacking in that it basically hooks into an existing non-voice application. This imposes some limitations, and the API accessible to the developer is limited. Any time you try to oversimplify something like this, you make crafting a powerful solution much more difficult. That said, we really couldn't find any competing product that meets our need for a .Net SDK for voice-enabling mobile applications. They currently have a nice little niche carved out.

I would have preferred to go with Nuance's C++ SDK (for which another company has written .Net wrappers), but the Nuance business model assumes we're developing a product for resale and involves some significant royalties. That is a real barrier for a company that wants to develop internal applications.

OTHER TIPS

Nuance has basically bought everyone up. They rule the speech market, I am afraid...

There are a few other companies that deal in the technology, but I don't know how well they do in the embedded market. There are Telisma and Loquendo, both of which have strong non-English presences (and their English isn't too bad either).

Then there is still IBM. They have ViaVoice Embedded.

One of the big things the industry is waiting for is to see what comes out of Microsoft's acquisition of TellMe, but I think they might stay away from the embedded market and instead push the processing to the "cloud", which is where TellMe has been for a long time.

I work with IVR applications; in addition to Nuance we're currently evaluating Microsoft, IBM, and Lumenvox.

The voice recognition applications included on most cell phones are designed to match voice input to a previously spoken phrase, such as assigning the phrase "Joe" to an address book entry and having your phone dial that entry when you say "Joe". The more powerful speech recognition engines try to decipher freeform speech by breaking a phrase down into phonemes, and then matching against an acoustic repository to try to figure out what was actually said. A full-blown speech recognition engine requires a fair amount of CPU horsepower; to do anything complex with voice recognition on a mobile device, you'll probably need to send data from the device to a server for processing.
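
To make the first kind of matching concrete (comparing an utterance against previously recorded phrases), here is a minimal Python sketch using dynamic time warping (DTW), a classic technique for speaker-dependent template matching. Nothing above says which algorithm any particular phone uses, so DTW is an assumption here, and the names `dtw_distance` and `best_match` and the one-number "features" are made up for illustration. A real system would compare per-frame acoustic feature vectors rather than single numbers.

```python
import math

def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature vectors."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between two frames
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b stays on the same frame
                cost[i][j - 1],      # b advances, a stays on the same frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

def best_match(utterance, templates):
    """Return (label, distance) for the stored template closest to the utterance."""
    scored = ((label, dtw_distance(utterance, frames))
              for label, frames in templates.items())
    return min(scored, key=lambda pair: pair[1])

# Hypothetical toy templates: one number per frame instead of real acoustic features.
templates = {
    "Joe": [(1.0,), (3.0,), (2.0,)],
    "Ann": [(5.0,), (5.0,), (1.0,)],
}

# Roughly the same contour as "Joe", but spoken more slowly (more frames).
spoken = [(1.0,), (1.1,), (3.0,), (2.9,), (2.0,)]

label, distance = best_match(spoken, templates)
print(label, round(distance, 3))
```

The warping step is what lets a slower or faster repetition of the same phrase still line up with its stored template. The cost grows with the number of templates and their length, which is one reason this approach suits small vocabularies such as an address book rather than freeform speech.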

Try looking into Microsoft's Speech API, http://msdn.microsoft.com/en-us/library/ms897381.aspx

I believe it runs on CE devices.

There is also the open source project CMU Sphinx. They have a variant called PocketSphinx that targets portable devices.

Licensed under: CC-BY-SA with attribution
