How to mix Grammar (Rules) & Dictation (Free speech) with SpeechRecognizer in C#
27-09-2019
Question
I really like Microsoft's latest speech recognition (and SpeechSynthesis) offerings.
http://msdn.microsoft.com/en-us/library/ms554855.aspx
http://estellasays.blogspot.com/2009/04/speech-recognition-in-cnet.html
However, I feel somewhat limited when using grammars.
Don't get me wrong: grammars are great for telling the speech recognition exactly which words and phrases to look out for. But what if I want it to recognise something I've not given it a heads-up about? Or what if I want to parse a phrase that is half pre-determined command name and half arbitrary words?
For example:
Scenario A - I say "Google [Oil Spill]" and I want it to open Google with search results for the term in brackets, which could be anything.
Scenario B - I say "Locate [Manchester]" and I want it to search for Manchester in Google Maps, or for anything else that is not pre-determined.
I want it to know that 'Google' and 'Locate' are commands and that whatever comes after them is a parameter (which could be anything).
Question: Does anyone know how to mix pre-determined grammars (words the speech recognition should recognise) with words that are not in the pre-determined grammar?
Code fragments:
using System;
using System.Diagnostics;
using System.Speech.Recognition;
...
...
SpeechRecognizer rec = new SpeechRecognizer();
rec.SpeechRecognized += rec_SpeechRecognized;
var c = new Choices();
c.Add("search");
var gb = new GrammarBuilder(c);
var g = new Grammar(gb);
rec.LoadGrammar(g);
rec.Enabled = true;
...
...
void rec_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
    if (e.Result.Text == "search")
    {
        string query = "How can I get a word not defined in Grammar recognised and passed into here!";
        launchGoogle(query);
    }
}
...
...
private void launchGoogle(string term)
{
    // Escape the term so spaces and punctuation survive in the query string.
    Process.Start("IEXPLORE", "https://www.google.com/search?q=" + Uri.EscapeDataString(term));
}
Solution
You could try something like this. It specifies a list of known commands, but also lets you use open dictation afterwards. It expects a command to be given before the open dictation, but you could reverse the order. Adding a blank entry (" ") to the command choices will also let you go straight to the dictation part.
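// Known command words.
Choices commandtype = new Choices();
commandtype.Add("search");
commandtype.Add("print");
commandtype.Add("open");
commandtype.Add("locate");
// Tag whichever command was spoken under the semantic key "comtype".
SemanticResultKey srkComtype = new SemanticResultKey("comtype", commandtype.ToGrammarBuilder());
GrammarBuilder gb = new GrammarBuilder();
// The grammar's culture has to match the culture of the recognizer that loads it.
gb.Culture = System.Globalization.CultureInfo.CreateSpecificCulture("en-GB");
gb.Append(srkComtype);   // the command first...
gb.AppendDictation();    // ...then free dictation
Grammar gr = new Grammar(gb);
// Load gr into your recognizer with LoadGrammar before starting recognition.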
Then, in your recognizer's SpeechRecognized handler, just use the result text:
private void recognizer_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
    System.Console.WriteLine(e.Result.Text);
}
You can add more Choices options and SemanticResultKeys to the structure to build more complex patterns if you wish. You can also append a wildcard (e.g. gb.AppendWildcard();).
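Once the handler has e.Result.Text, the command still has to be separated from the dictated words. The following is only a minimal sketch of one way to do that with plain string handling. The class SpokenCommand and the methods TrySplit and BuildGoogleUrl are hypothetical names that are not part of System.Speech, and the sketch assumes the recognized text starts with the command word, as the grammar above arranges.

```csharp
using System;

public static class SpokenCommand
{
    // Hypothetical helper: splits recognized text such as "search oil spill"
    // into a known command ("search") and the free-text remainder ("oil spill").
    public static bool TrySplit(string text, string[] commands,
                                out string command, out string query)
    {
        command = null;
        query = null;
        if (string.IsNullOrWhiteSpace(text)) return false;

        string trimmed = text.Trim();
        foreach (string candidate in commands)
        {
            bool startsWith = trimmed.StartsWith(candidate, StringComparison.OrdinalIgnoreCase);
            // Require a word boundary so "searching" does not count as "search".
            bool wholeWord = trimmed.Length == candidate.Length
                || (trimmed.Length > candidate.Length
                    && char.IsWhiteSpace(trimmed[candidate.Length]));
            if (startsWith && wholeWord)
            {
                command = candidate;
                query = trimmed.Substring(candidate.Length).Trim();
                return true;
            }
        }
        return false;
    }

    // Hypothetical helper: builds a Google search URL with the query escaped.
    public static string BuildGoogleUrl(string query)
    {
        return "https://www.google.com/search?q=" + Uri.EscapeDataString(query);
    }
}
```

In the handler you might call SpokenCommand.TrySplit(e.Result.Text, new[] { "search", "locate" }, out command, out query) and pass the query on to your own launch method. For the text "search oil spill" this yields the command "search" and the query "oil spill", and BuildGoogleUrl("oil spill") returns "https://www.google.com/search?q=oil%20spill".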
OTHER TIPS
You have two choices:
- You can use the dictation node for free text via GrammarBuilder.AppendDictation. The problem is that, because the recognizer has no context for what might be said, recognition quality tends to be lower.
- You can use a textbuffer node and provide a set of items using GrammarBuilder.Append(String, SubsetMatchingMode). This gives the recognizer enough context to produce good-quality recognitions without having to rebuild the entire grammar tree every time.
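The two options above can be sketched side by side as follows. This is an illustration under stated assumptions rather than a tested recipe: the class QueryGrammars and its method names are hypothetical, "locate" is taken from the question's example, SubsetMatchingMode.OrderedSubset is just one of the available matching modes, and the item list is expected to be non-empty.

```csharp
using System.Speech.Recognition;

public static class QueryGrammars
{
    // Option 1: a command followed by unconstrained dictation.
    public static Grammar BuildDictationGrammar()
    {
        var gb = new GrammarBuilder("locate");
        gb.AppendDictation();
        return new Grammar(gb) { Name = "locate + dictation" };
    }

    // Option 2: a command followed by one of a known set of items.
    // Subset matching lets the speaker say only part of an item's phrase.
    public static Grammar BuildSubsetGrammar(string[] items)
    {
        var choices = new Choices();
        foreach (string item in items)
        {
            choices.Add(new GrammarBuilder(item, SubsetMatchingMode.OrderedSubset));
        }

        var gb = new GrammarBuilder("locate");
        gb.Append(choices);
        return new Grammar(gb) { Name = "locate + known items" };
    }
}
```

Either grammar can then be passed to LoadGrammar on the recognizer. The second one only helps when the possible items (place names, file names, contacts) are known in advance.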