How can I get all text that I spoke using Speech Recognition (Microsoft.Speech)

https://stackoverflow.com/questions/21513970

06-10-2022

Question

I am using Microsoft System.Speech to recognize what the user says and convert it to text. The problem is that anything the user says has to be defined in a grammar before it is recognized. I also want to get the words that are not defined in the grammar.

For example, if the user says 'gru gru', I want the resulting text to be 'gru gru'. Can somebody tell me whether this is possible? I have searched in many places on the internet, but nothing I found works the way I expected.

The code is below:

        private delegate void DoStuff(); //delegate for the action

        SpeechRecognitionEngine recognizer = new SpeechRecognitionEngine(new System.Globalization.CultureInfo("en-US"));


        protected void Page_Load(object sender, EventArgs e)
        {

        }

        protected void btnRecog_Click(object sender, EventArgs e)
        {
            DoStuff myAction = SomeVeryLongAction;
            myAction.BeginInvoke(null, null);
        }

        protected void btnStop_Click(object sender, EventArgs e)
        {
        }

        private void SomeVeryLongAction()
        {

            recognizer.SpeechRecognized += recognizer_SpeechRecognized;                

            // Grammar dictationGrammar = new DictationGrammar();
            recognizer.RequestRecognizerUpdate();
            Choices colors = new Choices();
            colors.Add(new string[] { "red", "green", "blue", "hello" });

            GrammarBuilder gb = new GrammarBuilder();
            gb.Append(colors);

            // Create the Grammar instance.
            Grammar g = new Grammar(gb);

            recognizer.LoadGrammar(g);


            recognizer.BabbleTimeout = TimeSpan.FromSeconds(10.0);
            recognizer.EndSilenceTimeout = TimeSpan.FromSeconds(10.0);
            recognizer.EndSilenceTimeoutAmbiguous = TimeSpan.FromSeconds(10.0);
            recognizer.InitialSilenceTimeout = TimeSpan.FromSeconds(10.0);

            try
            {
                recognizer.SetInputToDefaultAudioDevice();
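                var result = recognizer.Recognize();

                // Recognize() returns null when the speech matches no loaded grammar,
                // so guard against it before reading the text.
                lblInfo.Text = result != null ? result.Text : string.Empty;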
            }
            catch (InvalidOperationException exception)
            {

            }
            finally
            {
                recognizer.UnloadAllGrammars();
            } 
        }   


        void recognizer_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
        {
            var tmp = e.Result.Text;
        }

At the moment, when I say something that is not defined in the grammar, such as 'love', the result is always null. I want a way to get anything that I say.

Solution

You can't do that with Microsoft.Speech.Recognition. That SR engine only supports grammars, and doesn't support free-form dictation.

If you want to do that, you will need to switch to System.Speech.Recognition, which supports dictation, but requires higher-quality audio input as well as training.
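
As a rough illustration of the switch described above, here is a minimal sketch of free-form dictation with System.Speech.Recognition. It assumes a Windows machine with an en-US desktop recognizer installed, a working default microphone, and a project reference to System.Speech.dll. The class name `DictationSketch` and the 10-second timeout are illustrative choices, not something taken from the question.

```csharp
using System;
using System.Globalization;
using System.Speech.Recognition; // requires a reference to System.Speech.dll (Windows only)

class DictationSketch // hypothetical name, for illustration only
{
    static void Main()
    {
        // Assumes an en-US recognizer is installed; the constructor throws otherwise.
        using (var recognizer = new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            // DictationGrammar allows free-form speech instead of a fixed list of phrases.
            recognizer.LoadGrammar(new DictationGrammar());
            recognizer.SetInputToDefaultAudioDevice();

            // Wait up to 10 seconds for speech to start (illustrative value).
            // Recognize() returns null if nothing was recognized in time.
            RecognitionResult result = recognizer.Recognize(TimeSpan.FromSeconds(10));

            Console.WriteLine(result != null ? result.Text : "(nothing recognized)");
        }
    }
}
```

With dictation, a made-up word such as 'gru gru' will most likely be mapped to the closest real words the engine knows, so the text may still differ from what was actually said.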

Licensed under: CC-BY-SA with attribution