Preprocessing audio in Android Speech Input recognizer

https://stackoverflow.com/questions/10572396

08-06-2021

Question

I'm doing some basic command recognition and using the Google Search Input API for that. However, I want to capture the audio myself, preprocess it (denoise, boost amplitude, etc.), send the modified audio to the recognizer, and obtain the results. Is that possible?

I know you can use SpeechRecognizer along with RecognitionListener to obtain the audio through the onBufferReceived method. However, I want to do preprocessing rather than postprocessing. Is there any workaround or hack to feed the Google recognizer with processed data?

Solution

"preprocess the audio (denoise, boost amplitude, etc.), send the modified audio to the recognizer, and obtain the results"

Speech recognition systems usually suffer from this kind of preprocessing. Incorrectly implemented denoising can lower speech recognition accuracy because it corrupts the spectrum in unpredictable ways. Amplitude boosting doesn't help because the amplitude is normalized at the very beginning of speech recognition. Your preprocessing can only hurt.
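To make the amplitude point concrete, here is a minimal sketch in plain Java (no Android APIs). The answer does not say which normalization a given recognizer applies, so this is an assumption: it uses mean subtraction of per-frame log energy, a simplified stand-in similar in spirit to cepstral mean normalization. The class name, method names, frame size, and the synthetic tone are all hypothetical and chosen only for illustration.

```java
import java.util.Arrays;

/** Illustrative only: a constant gain cancels out after log-domain mean normalization. */
final class GainInvarianceSketch {

    /** Log energy of each non-overlapping frame of 16-bit PCM samples. */
    static double[] frameLogEnergies(short[] pcm, int frameSize) {
        int frames = pcm.length / frameSize;
        double[] out = new double[frames];
        for (int f = 0; f < frames; f++) {
            double energy = 0.0;
            for (int i = 0; i < frameSize; i++) {
                double s = pcm[f * frameSize + i];
                energy += s * s;
            }
            // Tiny floor avoids log(0); the sketch assumes no frame is entirely silent.
            out[f] = Math.log(energy + 1e-10);
        }
        return out;
    }

    /** Subtracts the mean, so any constant offset in the log domain disappears. */
    static double[] meanNormalize(double[] values) {
        double mean = Arrays.stream(values).average().orElse(0.0);
        return Arrays.stream(values).map(v -> v - mean).toArray();
    }

    /** Multiplies every sample by an integer gain, clamping to the 16-bit range. */
    static short[] boost(short[] pcm, int gain) {
        short[] out = new short[pcm.length];
        for (int i = 0; i < pcm.length; i++) {
            int v = pcm[i] * gain;
            out[i] = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, v));
        }
        return out;
    }

    /** Synthetic input: a 440 Hz tone at 16 kHz whose loudness ramps up over time. */
    static short[] syntheticTone(int samples) {
        short[] pcm = new short[samples];
        for (int i = 0; i < samples; i++) {
            double envelope = 500.0 + 2000.0 * i / samples;
            pcm[i] = (short) Math.round(envelope * Math.sin(2 * Math.PI * 440.0 * i / 16000.0));
        }
        return pcm;
    }
}
```

Running syntheticTone(1600) and boost(tone, 2) through frameLogEnergies(..., 160) gives raw values that differ by a constant log(4) per frame, and meanNormalize removes that constant, so both signals end up with the same features up to floating-point error. This holds only while the boost does not clip; once samples hit the 16-bit limit the waveform is distorted, which is one way boosting can actively hurt.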

If you still want to try it, use pocketsphinx, which lets you supply the audio yourself:

http://cmusphinx.sourceforge.net/2011/05/building-pocketsphinx-on-android/

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow