Fuzzy EmulateRecognize on Windows Speech Recognition

https://stackoverflow.com/questions/18623492

27-06-2022

Question

Microsoft's C# API provides a SpeechRecognitionEngine to recognize an audio stream. One way to test recognition is to call the method SpeechRecognizer.EmulateRecognize.

According to documentation:

recognizers ignore case and character width when applying grammar rules to the input phrase

I'd like to know whether there is a way to handle fuzzier strings, because confidence is very low even for slightly misspelled text. That is far from real-life input.

With audio, I could say Hello, Helo, or Helllo and still get good confidence.
With text, the engine is very strict.

EDIT: For what purpose?

My speech engine is working fine, but I also want to trigger it from text input.

Let's say you're on a mobile phone and use HTML5 SpeechRecognition. I'd like to send the recognized text to the engine to get the same behavior as with speech.

Solution

OK, I found the answer! I should have read the documentation more carefully.

SpeechRecognizer.EmulateRecognize

is really straightforward and simply tests the given string, but

SpeechRecognizer.SimulateRecognize

will try to build an "idealized" audio representation of the input phrase (based on the engine's lexicon and acoustic model).

And so it works very well!
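A minimal sketch of the comparison described above, for anyone who wants to try it. It assumes the Microsoft.Speech.Recognition namespace (Microsoft Speech Platform SDK) with a matching runtime language pack installed, since to my knowledge SimulateRecognize is exposed there on SpeechRecognitionEngine and not in System.Speech. The two-word grammar and the misspelled input "helo" are made up for illustration, and the confidence values you get will depend on the installed recognizer.

```csharp
using System;
using Microsoft.Speech.Recognition; // assumption: Speech Platform SDK, not System.Speech

class EmulateVsSimulate
{
    static void Main()
    {
        using (var engine = new SpeechRecognitionEngine())
        {
            // Hypothetical grammar: two phrases the engine should accept.
            var builder = new GrammarBuilder(new Choices("hello", "goodbye"));
            // Keep the grammar culture in line with the recognizer's culture.
            builder.Culture = engine.RecognizerInfo.Culture;
            engine.LoadGrammar(new Grammar(builder));

            // Text is matched against the grammar directly.
            Report("EmulateRecognize", engine.EmulateRecognize("helo"));

            // An idealized audio representation of the text is recognized instead.
            Report("SimulateRecognize", engine.SimulateRecognize("helo"));
        }
    }

    // Prints the recognized text and confidence, or notes that nothing matched.
    static void Report(string label, RecognitionResult result)
    {
        if (result == null)
        {
            Console.WriteLine(label + ": no result");
        }
        else
        {
            Console.WriteLine(label + ": \"" + result.Text + "\" (confidence " + result.Confidence + ")");
        }
    }
}
```

If the behavior matches what is described above, the EmulateRecognize call should reject or score the misspelled input poorly, while SimulateRecognize should map it to "hello".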

Other tips

When you send audio to the recognizer, the SR engine does a lot of work to create a set of phonemes (via acoustic modeling) and then a set of strings (via phoneme modeling). During that process, much of the ambiguity gets eliminated. EmulateRecognize doesn't generate audio that gets processed via the SR engine; it skips all the modeling and just does a string match.

There's no way to work around this that doesn't involve a lot of work (e.g., implementing a SAPI-compatible SR engine that only does EmulateRecognize).

Could you pass your string to SpeechSynthesizer.Speak() and use the resulting audio as input to the SpeechRecognitionEngine?
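A rough sketch of that idea, assuming System.Speech on a Windows machine with a synthesis voice and a recognizer installed for the same language. The grammar phrases and the input "helo" are placeholders, and I have not measured how well the confidence holds up with synthesized audio.

```csharp
using System;
using System.IO;
using System.Speech.Recognition;
using System.Speech.Synthesis;

class SpeakThenRecognize
{
    static void Main()
    {
        using (var audio = new MemoryStream())
        {
            // Step 1: render the text to an in-memory WAV stream.
            using (var synth = new SpeechSynthesizer())
            {
                synth.SetOutputToWaveStream(audio);
                synth.Speak("helo");        // placeholder misspelled input
                synth.SetOutputToNull();    // stop writing to the stream
            }

            // Rewind so the recognizer reads from the start of the audio.
            audio.Position = 0;

            // Step 2: feed the synthesized audio to the recognizer.
            using (var engine = new SpeechRecognitionEngine())
            {
                // Hypothetical grammar: two phrases the engine should accept.
                engine.LoadGrammar(new Grammar(new GrammarBuilder(new Choices("hello", "goodbye"))));
                engine.SetInputToWaveStream(audio);

                RecognitionResult result = engine.Recognize();
                if (result == null)
                {
                    Console.WriteLine("No result");
                }
                else
                {
                    Console.WriteLine("\"" + result.Text + "\" (confidence " + result.Confidence + ")");
                }
            }
        }
    }
}
```

Because the text goes through the full audio pipeline, the recognizer gets the same acoustic and language modeling that real speech would, at the cost of a synthesis pass per input.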

License: CC-BY-SA with attribution

Not affiliated with StackOverflow