Question

I am currently developing a cross-platform app that should run on Google Glass (Android 4.0.4), a smartphone (Android 4.0.4 or newer) and other wearables. The minimum version will be ICS (Ice Cream Sandwich).
The app presents different Views in an event-driven way, triggered either by the user or by the system (network events).
For user control I want to implement speech recognition that only needs to recognize numbers (or at least single digits) and the commands "forward" and "backward". It is important that it also works offline, runs in the background while the application is running, and does not cover the user interface.
Related Work:
SpeechRecognizer seems to offer the offline functionality only from Jelly Bean onwards (I haven't found a way to use it on Android 4.0.4).
Implementing a custom IME and using Voice Typing seems very expensive and dirty to me (like Utter! does; by the way, really nice work!).
First attempts to use PocketSphinx haven't been successful yet (a sketch of what I'm aiming for is below).
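
For reference, this is roughly the direction I'm trying to take with PocketSphinx. The sketch below assumes the pocketsphinx-android wrapper (5prealpha-style API) and its bundled en-us acoustic model; the asset names and the commands.gram grammar file are placeholders of mine, not something the library ships under those names.

    import java.io.File;
    import java.io.IOException;

    import android.content.Context;

    import edu.cmu.pocketsphinx.Assets;
    import edu.cmu.pocketsphinx.Hypothesis;
    import edu.cmu.pocketsphinx.RecognitionListener;
    import edu.cmu.pocketsphinx.SpeechRecognizer;
    import edu.cmu.pocketsphinx.SpeechRecognizerSetup;

    public class CommandRecognizer implements RecognitionListener {

        private static final String COMMAND_SEARCH = "commands";
        private SpeechRecognizer recognizer;

        /** Call from a background thread: copying the model and setting it up is slow. */
        public void setup(Context context) throws IOException {
            File assetDir = new Assets(context).syncAssets();
            recognizer = SpeechRecognizerSetup.defaultSetup()
                    .setAcousticModel(new File(assetDir, "en-us-ptm"))       // placeholder asset name
                    .setDictionary(new File(assetDir, "cmudict-en-us.dict")) // placeholder asset name
                    .getRecognizer();
            recognizer.addListener(this);
            // commands.gram is a small JSGF grammar restricted to digits plus
            // "forward" and "backward", e.g.:
            //   #JSGF V1.0;
            //   grammar commands;
            //   public <command> = forward | backward | zero | one | two | three |
            //                      four | five | six | seven | eight | nine ;
            recognizer.addGrammarSearch(COMMAND_SEARCH, new File(assetDir, "commands.gram"));
            recognizer.startListening(COMMAND_SEARCH);
        }

        @Override public void onResult(Hypothesis hypothesis) {
            if (hypothesis != null) {
                String command = hypothesis.getHypstr(); // e.g. "forward" or "three"
                // dispatch the command to the view layer here
            }
        }

        @Override public void onEndOfSpeech() {
            // Stop (which delivers the final result) and listen again, so
            // recognition keeps running continuously.
            recognizer.stop();
            recognizer.startListening(COMMAND_SEARCH);
        }

        @Override public void onBeginningOfSpeech() { }
        @Override public void onPartialResult(Hypothesis hypothesis) { }
        @Override public void onError(Exception e) { }
        @Override public void onTimeout() { }
    }

The tiny grammar keeps the search space to about a dozen words, which is also what should make offline recognition feasible on ICS-era hardware.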

Solution

The offline voice capabilities of Jelly Bean are handled by the Google Search application internally. There has been no change to either the RecognizerIntent or the SpeechRecognizer API.

This isn't ideal for what you want to achieve, as a dependency on a closed-source application that isn't cross-platform will throw a spanner in the works. Regardless of that, a simple offline = true parameter is nowhere to be seen and you'll end up having to coerce this behaviour. I have requested such a parameter, by the way!
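
To make that concrete, below is a minimal sketch of driving the standard recognition intent on ICS. The point is simply that there is nothing you can put in that intent to force offline behaviour; the extras used are the common documented ones and the rest is my own scaffolding.

    import java.util.ArrayList;

    import android.app.Activity;
    import android.content.Intent;
    import android.os.Bundle;
    import android.speech.RecognizerIntent;

    public class VoiceDemoActivity extends Activity {

        private static final int REQUEST_SPEECH = 1;

        @Override
        protected void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            // Build the standard recognition intent. There is no documented
            // "offline = true" extra; whether recognition happens offline is
            // decided internally by the Google Search application.
            Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
            intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                    RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
            intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "en-US");
            intent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 3);
            startActivityForResult(intent, REQUEST_SPEECH);
        }

        @Override
        protected void onActivityResult(int requestCode, int resultCode, Intent data) {
            super.onActivityResult(requestCode, resultCode, data);
            if (requestCode == REQUEST_SPEECH && resultCode == RESULT_OK) {
                ArrayList<String> results =
                        data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
                // candidate transcriptions, best match first
            }
        }
    }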

Google handles its wake-up phrase with a dedicated processor core, but it looks unlikely that the manufacturers intend to expose this functionality to anyone other than OEMs.

That leaves the alternative recognition providers that offer RESTful services, such as iSpeech, AT&T and Nuance, but again, you'll be murdering the battery and using significant data if you take this approach, not to mention the audio conflicts that occur on the Android platform.
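
For illustration, a cloud provider ultimately means shipping the recorded audio over HTTP for every single utterance, which is where the battery and data cost comes from. The endpoint, content type and response handling below are entirely hypothetical placeholders, not any real provider's API:

    import java.io.ByteArrayOutputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public final class CloudRecognizerSketch {

        // Hypothetical endpoint: a real provider has its own URL, API key
        // headers and audio format requirements.
        private static final String ENDPOINT = "https://speech.example.com/v1/recognize";

        /** Uploads a recorded WAV file and returns the raw response body.
         *  Call this from a background thread, never from the UI thread. */
        public static String recognize(File audioFile) throws Exception {
            HttpURLConnection conn = (HttpURLConnection) new URL(ENDPOINT).openConnection();
            conn.setDoOutput(true);
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Content-Type", "audio/wav");

            // Every utterance means uploading the whole recording.
            OutputStream out = conn.getOutputStream();
            InputStream in = new FileInputStream(audioFile);
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            in.close();
            out.close();

            InputStream response = conn.getInputStream();
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            while ((read = response.read(buffer)) != -1) {
                body.write(buffer, 0, read);
            }
            response.close();
            conn.disconnect();
            return body.toString("UTF-8");
        }
    }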

Finally, you end up with Sphinx. At present I consider it the only viable option for lowering resource usage, but it doesn't get around the audio conflict issues. I've been working on getting it running within my application for a long time, but I still have major issues with false positives that have stopped me from including it in production.
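
For what it's worth, the main lever against those false positives in PocketSphinx's keyword-spotting mode is the keyword threshold, which has to be tuned per phrase, empirically. A sketch, again assuming the pocketsphinx-android wrapper with placeholder model file names:

    import java.io.File;
    import java.io.IOException;

    import edu.cmu.pocketsphinx.SpeechRecognizer;
    import edu.cmu.pocketsphinx.SpeechRecognizerSetup;

    public final class KeywordSpotting {

        /**
         * Builds a keyword-spotting recognizer. A larger threshold (e.g. moving
         * from 1e-20f towards 1e-5f) makes triggering stricter and cuts false
         * positives, at the cost of missed detections.
         */
        public static SpeechRecognizer build(File modelDir, String keyphrase) throws IOException {
            SpeechRecognizer recognizer = SpeechRecognizerSetup.defaultSetup()
                    .setAcousticModel(new File(modelDir, "en-us-ptm"))       // placeholder
                    .setDictionary(new File(modelDir, "cmudict-en-us.dict")) // placeholder
                    .setKeywordThreshold(1e-20f)                             // tune per phrase
                    .getRecognizer();
            recognizer.addKeyphraseSearch("wakeup", keyphrase);
            return recognizer;
        }
    }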

It is probably your only option until Google, processor manufacturers and OEMs work out how to offer such functionality without every application installed on the device wanting a piece of the action, which is inevitable.

I'm not sure this response actually provides an answer; it mostly excludes some options!

Good luck

EDIT: In a wearables environment, such products will have access to the dedicated cores; at the very least, they need to make sure they do by choosing a processor with such capabilities. From my interactions with companies developing such tech, they often overlook this or are unaware of its necessity.

OTHER TIPS

I want to propose a partial answer to your question. Since you want the speech recognition not to interfere with the UI, you can create a Service and run a continuous speech recognizer inside it, avoiding both the graphical widget and the "beep" sound. The following approach worked fine for me: Android Speech Recognition Continuous Service
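
To make that concrete, here is a minimal sketch of such a Service, built on the stock android.speech.SpeechRecognizer; the restart-on-result/error pattern is what gives the continuous behaviour. Muting the "beep", requesting the RECORD_AUDIO permission and handling audio focus are deliberately left out.

    import java.util.ArrayList;

    import android.app.Service;
    import android.content.Intent;
    import android.os.Bundle;
    import android.os.IBinder;
    import android.speech.RecognitionListener;
    import android.speech.RecognizerIntent;
    import android.speech.SpeechRecognizer;

    public class ContinuousRecognitionService extends Service implements RecognitionListener {

        private SpeechRecognizer recognizer;
        private Intent recognizerIntent;

        @Override
        public void onCreate() {
            super.onCreate();
            recognizer = SpeechRecognizer.createSpeechRecognizer(this);
            recognizer.setRecognitionListener(this);

            recognizerIntent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
            recognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                    RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
            recognizerIntent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, getPackageName());
            recognizerIntent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 3);
        }

        @Override
        public int onStartCommand(Intent intent, int flags, int startId) {
            recognizer.startListening(recognizerIntent);
            return START_STICKY;
        }

        @Override
        public void onResults(Bundle results) {
            ArrayList<String> matches =
                    results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
            // Match against "forward", "backward" or digits here, then listen again.
            recognizer.startListening(recognizerIntent);
        }

        @Override
        public void onError(int error) {
            // "No match" and "speech timeout" simply mean: start over. A production
            // version would back off on busy or network errors instead of looping.
            recognizer.startListening(recognizerIntent);
        }

        @Override
        public void onDestroy() {
            recognizer.destroy();
            super.onDestroy();
        }

        @Override public IBinder onBind(Intent intent) { return null; }

        // Remaining callbacks are not needed for this sketch.
        @Override public void onReadyForSpeech(Bundle params) { }
        @Override public void onBeginningOfSpeech() { }
        @Override public void onRmsChanged(float rmsdB) { }
        @Override public void onBufferReceived(byte[] buffer) { }
        @Override public void onEndOfSpeech() { }
        @Override public void onPartialResults(Bundle partialResults) { }
        @Override public void onEvent(int eventType, Bundle params) { }
    }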

Licensed under: CC-BY-SA with attribution