Question

I'm trying to use Windows Phone 8 speech recognition to recognize custom pronunciations of words. I'm trying to use the samples provided on MSDN, but am coming up short. First of all, I'm using a lexicon file (.pls) because the "sapi" namespace for inline pronunciations is failing (for both the pron and display attributes), but maybe I'll save that for a different question. So anyway, here's what I have:

<?xml version="1.0" encoding="utf-8" ?>
<grammar version="1.0" xml:lang="en-US"  tag-format="semantics/1.0" root="thecolor"
         xmlns="http://www.w3.org/2001/06/grammar" >
  <lexicon uri="ms-appx:///SRGSLexicon.pls" />
  <rule id="thecolor">
    <item>blue</item>
  </rule>
</grammar>

That's my SRGS grammar. I load it like this:

    Dim SRGSGrammar As Uri = New Uri("ms-appx:///SRGSGrammar.xml", UriKind.Absolute)
    _myRecognizer.Grammars.AddGrammarFromUri("SRGSGrammar", SRGSGrammar)

That seems to load just fine. Notice the <lexicon/> tag in the grammar, though. I've also tried adding type="application/pls+xml" to the lexicon element, but that gives a format exception.

Here's my PLS file:

<?xml version="1.0" encoding="utf-8" ?>
<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      alphabet="x-microsoft-ups" xml:lang="en-US">
  <lexeme>
    <grapheme> blue </grapheme>
    <phoneme> W S1 AX T CH AX M AX K S2 AA L IH T </phoneme>
  </lexeme>
</lexicon>

(Note: both of these files are in my app's root, set to Content and Copy if Newer).

Then I hit a button called "speak", which does Dim recoResult = Await _myRecognizer.RecognizeAsync(). I then say "whatchamacallit" and it gives me very low confidence, says the rule used is "thecolor", and returns the text "blue". It doesn't even use the PLS as far as I can see. If I do this again and this time say "blue", I get close to 100% confidence.

I want "whatchamacallit" from the PLS to be recognized, not "blue" from the SRGS grammar, but the only thing that gets very high confidence is "blue" (99%), and that is also the result text.

My PLS appears to load (I can't be 100% sure, but any URI other than the one I give above causes a FileNotFound exception, so that's why I think it is loading).

Note: "How do I use a lexicon with SpeechSynthesizer?" is not what this question is about, although both use the whatchamacallit example in the PLS. Also, "Using SSML for advanced text-to-speech on Windows Phone 8" gave me some hope, as it's the only implementation of a PLS I've seen, but alas it is for a different technology and doesn't work in my example.

Has anyone got custom pronunciations to work in WP8 via a PLS file (or inline in <Token/> with sapi)? If so, can you help?

Solution

Todd, I tried to replicate your problem since I had a strong suspicion it had something to do with the uri-scheme. I didn't have your complete code but was able to replicate it by just putting the grammar and lexicon files in the root folder of the app's local storage.

When I used type="application/pls+xml" in C#, I didn't get the 80045003 error. Rather, I kept getting this:

WinRT information: Grammar error found: C:\Data\Users\DefApps\AppData\{A7C75BFD-F873-4DA9-834C-C4CA3D97AA6B}\Local\SRGSGrammar.xml, line 4: Cannot compile lexicon file "ms-appdata:///local/SRGSLexicon.xml": 0x80004003

which I think is an invalid-pointer error (0x80004003 is E_POINTER), suggesting the lexicon file could not be opened. When I paid closer attention to the error message, I noticed that the paths the parser reports for the grammar file and the lexicon file are different: the grammar has been resolved to a full file-system path, while the lexicon is still the raw "ms-appdata:///" URI, even though I was using "ms-appdata:///" to reference both files.

It turns out that the grammar parser probably cannot accept ANY of the special URI schemes in the lexicon element. I took the full file path from the error message, used it for the uri attribute of the PLS file, and that worked. You'll notice I'm still using type="application/pls+xml".

So I'm not sure that this workaround is an acceptable solution, but I believe it gets to the root of the problem.

This is the code (in C#) that makes this work:

SRGSLexicon.pls (unchanged)

SRGSGrammar.xml (using a file path rather than the uri-scheme)

<?xml version="1.0" encoding="utf-8" ?>
<grammar version="1.0" xml:lang="en-US"  tag-format="semantics/1.0" root="thecolor"
         xmlns="http://www.w3.org/2001/06/grammar" >
  <lexicon uri="C:\Data\Users\DefApps\AppData\{A7C75BFD-F873-4DA9-834C-C4CA3D97AA6B}\Local\SRGSLexicon.pls" type="application/pls+xml" />
  <rule id="thecolor">
    <item>blue</item>
  </rule>
</grammar>

My app code (C#)

    public MainPage()
    {
        InitializeComponent();
        var srgsGrammar = new Uri("ms-appx:////SRGSGrammar.xml", UriKind.Absolute);
        _recognizerUi.Recognizer.Grammars.AddGrammarFromUri("SRGSGrammar", srgsGrammar);
    }

    readonly SpeechRecognizerUI _recognizerUi = new SpeechRecognizerUI();

    private async void Test_OnClick(object sender, RoutedEventArgs e)
    {
        //I used these next 2 lines to show the FilePath of the SRGSGrammar.xml file, and I used the same folder
        //structure for the lexicon pls file uri (just changed the file name)
        //var fileName = (await StorageFile.GetFileFromApplicationUriAsync(new Uri("ms-appdata:///local/SRGSGrammar.xml"))).Path;
        //MessageBox.Show(fileName);

        var recoResult = await _recognizerUi.RecognizeWithUIAsync();
        var x = recoResult.RecognitionResult.TextConfidence;
        MessageBox.Show(((int)x).ToString()); //show confidence
    }

I hope this helps. I think the grammar parser just doesn't know what to do with the URI scheme in the lexicon element.
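
If hard-coding that path is a concern (it contains the app's GUID), one option might be to patch the lexicon uri at runtime instead. What follows is only a minimal sketch and was not tested on a device. GrammarPatcher and WithLexiconPath are hypothetical names, not part of any SDK, and the code assumes the grammar XML has already been read into a string.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical helper (not part of the Windows Phone SDK).
static class GrammarPatcher
{
    // SRGS namespace, as declared on the <grammar> element above.
    static readonly XNamespace Srgs = "http://www.w3.org/2001/06/grammar";

    // Returns the grammar XML with every top-level <lexicon> element pointing
    // at lexiconPath, which is assumed to be a plain file-system path rather
    // than an ms-appx:/// or ms-appdata:/// URI.
    public static string WithLexiconPath(string grammarXml, string lexiconPath)
    {
        XDocument doc = XDocument.Parse(grammarXml);
        foreach (XElement lexicon in doc.Root.Elements(Srgs + "lexicon"))
        {
            lexicon.SetAttributeValue("uri", lexiconPath);
            lexicon.SetAttributeValue("type", "application/pls+xml");
        }
        // Note: XDocument.ToString() omits the <?xml ...?> declaration.
        return doc.ToString();
    }
}
```

On the phone, the lexicon path could presumably be built from Windows.Storage.ApplicationData.Current.LocalFolder.Path plus the file name. The patched grammar would then need to be written back to local storage before calling AddGrammarFromUri. Whether the recognizer accepts a grammar rewritten this way is an assumption I have not verified.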

Licensed under: CC-BY-SA with attribution