Question

The Web Speech API specification says:

text attribute
This attribute specifies the text to be synthesized and spoken for this utterance. This may be either plain text or a complete, well-formed SSML document. For speech synthesis engines that do not support SSML, or only support certain tags, the user agent or speech engine must strip away the tags they do not support and speak the text.

It does not provide an example of using text with an SSML document.

I tried the following in Chrome 33:

var msg = new SpeechSynthesisUtterance();
msg.text = '<?xml version="1.0"?>\r\n<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">ABCD</speak>';
speechSynthesis.speak(msg);

It did not work -- the voice attempted to narrate the XML tags. Is this code valid?
Do I have to provide an XMLDocument object instead?

I am trying to understand whether Chrome violates the specification (which should be reported as a bug), or whether my code is invalid.


Solution

In Chrome 46 on Windows, the text is parsed properly as an XML document when the language is set to en; however, I see no evidence that the tags have any effect. I heard no difference between the <emphasis> and non-<emphasis> versions of this SSML:

var msg = new SpeechSynthesisUtterance();
msg.text = '<?xml version="1.0"?>\r\n<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US"><emphasis>Welcome</emphasis> to the Bird Seed Emporium.  Welcome to the Bird Seed Emporium.</speak>';
msg.lang = 'en';
speechSynthesis.speak(msg);

The <phoneme> tag was also completely ignored, which made my attempt to speak IPA fail.

var msg = new SpeechSynthesisUtterance();
// Note: æ is written as &#230; because the HTML entity &aelig; is not defined in XML.
msg.text = '<?xml version="1.0" encoding="ISO-8859-1"?> <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2001/10/synthesis http://www.w3.org/TR/speech-synthesis/synthesis.xsd" xml:lang="en-US"> Pavlova is a meringue-based dessert named after the Russian ballerina Anna Pavlova. It is a meringue cake with a crisp crust and soft, light inside, usually topped with fruit and, optionally, whipped cream.  The name is pronounced <phoneme alphabet="ipa" ph="p&#230;v&#712;lo&#650;v&#601;">...</phoneme> or <phoneme alphabet="ipa" ph="p&#593;&#720;v&#712;lo&#650;v&#601;">...</phoneme>, unlike the name of the dancer, which was <phoneme alphabet="ipa" ph="&#712;p&#593;&#720;vl&#601;v&#601;">...</phoneme> </speak>';
msg.lang = 'en';
speechSynthesis.speak(msg);

This is despite the fact that the Microsoft speech API does handle SSML correctly. Here is a C# snippet, suitable for use in LINQPad:

var str = "Pavlova is a meringue-based dessert named after the Russian ballerina Anna Pavlova. It is a meringue cake with a crisp crust and soft, light inside, usually topped with fruit and, optionally, whipped cream.  The name is pronounced /pævˈloʊvə/ or /pɑːvˈloʊvə/, unlike the name of the dancer, which was /ˈpɑːvləvə/.";
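// Wrap each /.../ IPA transcription in an SSML <phoneme> element.
var regex = new Regex("/([^/]+)/");
if (regex.IsMatch(str))
{
    str = regex.Replace(str, "<phoneme alphabet=\"ipa\" ph=\"$1\">word</phoneme>");
    str.Dump(); // LINQPad helper: prints the generated markup
}
// SpeechSynthesizer and PromptBuilder live in System.Speech.Synthesis
// (add a reference to System.Speech.dll in LINQPad).
SpeechSynthesizer synth = new SpeechSynthesizer();
PromptBuilder pb = new PromptBuilder();
pb.AppendSsmlMarkup(str);
synth.Speak(pb);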
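
For comparison, the same /.../ to <phoneme> substitution can be sketched in JavaScript. This is only a minimal sketch: ipaToSsml and its lang parameter are hypothetical names, not part of the Web Speech API, and, given the results above, Chrome may still ignore the resulting <phoneme> elements.

```javascript
// Hypothetical helper: wraps each /.../ transcription in an SSML <phoneme>
// element and returns a complete SSML document as a string.
function ipaToSsml(text, lang = 'en-US') {
  // Escape characters that are special in XML text and attribute values.
  const escapeXml = (s) =>
    s.replace(/&/g, '&amp;')
     .replace(/</g, '&lt;')
     .replace(/>/g, '&gt;')
     .replace(/"/g, '&quot;');

  // Same pattern as the C# regex: anything between two slashes.
  const body = escapeXml(text).replace(
    /\/([^/]+)\//g,
    '<phoneme alphabet="ipa" ph="$1">word</phoneme>'
  );

  return (
    '<?xml version="1.0"?>' +
    `<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="${lang}">` +
    body +
    '</speak>'
  );
}
```

In a browser, the returned string would be assigned to msg.text exactly as in the earlier snippets.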

Outras dicas

There are bugs for this issue currently open with Chromium.

  • 88072: Extension TTS API platform implementations need to support SSML
  • 428902: speechSynthesis.speak() doesn't strip unrecognized tags. This bug has been fixed in Chrome as of September 2016.

I tried this in Chrome 104.0.5112.101 on Linux, and it did not work. The debugging console showed this message:

speechSynthesis.speak() without user activation is deprecated and will be removed

Triggering the call from a button click, as suggested in "The question of whether speechSynthesis is allowed to run without user interaction", does work for me, but only for plain text, not for SSML-formatted text.
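
A minimal sketch of the button approach follows. speakOnClick is a hypothetical helper name, and the synth and Utterance parameters exist only so the function can be exercised outside a browser; by default they fall back to the real speechSynthesis and SpeechSynthesisUtterance globals.

```javascript
// Hypothetical helper: calls speak() only from inside a click handler,
// so the call happens in response to a user activation.
function speakOnClick(
  button,
  text,
  synth = globalThis.speechSynthesis,
  Utterance = globalThis.SpeechSynthesisUtterance
) {
  button.addEventListener('click', () => {
    synth.speak(new Utterance(text));
  });
}

// Browser usage (assumes the page contains <button id="speak">):
// speakOnClick(document.getElementById('speak'), 'Welcome to the Bird Seed Emporium.');
```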

I have tested this: XML parsing seems to work properly on Windows, but it does not work properly on macOS.
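
On platforms where the tags are read aloud, one possible workaround is to strip the markup yourself before calling speak(), which is what the specification asks the user agent to do anyway. The following is a rough sketch under stated assumptions: ssmlToPlainText is a hypothetical helper, it relies on regular expressions rather than a real XML parser, and it will mishandle attribute values that contain a ">" character. In a browser, parsing with DOMParser and reading textContent would be more robust.

```javascript
// Hypothetical helper: reduces an SSML string to the plain text it contains.
function ssmlToPlainText(ssml) {
  const named = { amp: '&', lt: '<', gt: '>', quot: '"', apos: "'" };
  return ssml
    .replace(/<\?xml[\s\S]*?\?>/g, '')  // drop the XML declaration
    .replace(/<!--[\s\S]*?-->/g, '')   // drop comments
    .replace(/<[^>]+>/g, '')            // drop every remaining tag
    // Decode numeric and predefined XML entities in a single pass,
    // so already-decoded text is never decoded twice.
    .replace(
      /&(?:#x([0-9a-fA-F]+)|#(\d+)|(amp|lt|gt|quot|apos));/g,
      (match, hex, dec, name) => {
        if (hex) return String.fromCodePoint(parseInt(hex, 16));
        if (dec) return String.fromCodePoint(Number(dec));
        return named[name];
      }
    )
    .replace(/\s+/g, ' ')               // collapse runs of whitespace
    .trim();
}
```

The stripped string can then be passed to SpeechSynthesisUtterance as ordinary text; any prosody or pronunciation hints in the markup are lost.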

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow