How can I use HTML5 to both record audio (speech) and convert the recorded speech into text?

https://stackoverflow.com/questions/12545561

03-07-2021

Question

I have a project in mind where I want users to record speech in a browser. I know that for recording audio I can use getUserMedia and for speech to text input I can use x-webkit-speech. I'm OK with the browser limitation. Is there a way that I can do this in a single step?

I'd prefer an HTML5 solution, but I'm willing to go with JavaScript if that's the only way to do it. I'm also willing to consider server-side solutions if necessary (LAMP environment). This would likely only be accessed from a laptop/desktop browser, but if I can also make it compatible with mobile devices, that would be great, too.

Solution

Well, getUserMedia more or less is the HTML5 solution: it is a JavaScript API specified by the W3C (its interface is described in WebIDL), and it belongs to the family of new scripting APIs that are usually grouped under HTML5.
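To make the getUserMedia part concrete, here is a minimal sketch of how the microphone might be opened. It assumes the promise-based navigator.mediaDevices.getUserMedia where available and falls back to the older, often vendor-prefixed, callback form. The function name openMicrophone and the nav parameter are made up for this example and are not part of any API.

```javascript
// Sketch: open the microphone, preferring the promise-based API and
// falling back to the older prefixed, callback-based one.
// `nav` is passed in (normally the global `navigator`) so the logic can be
// exercised outside a browser; the function name is invented for this example.
function openMicrophone(nav) {
  const constraints = { audio: true, video: false };

  // Current form: navigator.mediaDevices.getUserMedia returns a Promise.
  if (nav.mediaDevices && typeof nav.mediaDevices.getUserMedia === 'function') {
    return nav.mediaDevices.getUserMedia(constraints);
  }

  // Older form: (constraints, onSuccess, onError), often vendor-prefixed.
  const legacy = nav.getUserMedia || nav.webkitGetUserMedia || nav.mozGetUserMedia;
  if (typeof legacy === 'function') {
    return new Promise((resolve, reject) => {
      legacy.call(nav, constraints, resolve, reject);
    });
  }

  return Promise.reject(new Error('getUserMedia is not supported here'));
}
```

In a page you would call openMicrophone(navigator) and handle the resulting stream in .then(...). The browser asks the user for permission first, so the rejection path needs handling as well.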

Besides that, you might want to take a look at the Web Audio API draft: https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html

All of this still has draft status and is under heavy development.

x-webkit-speech is actually not too poor a choice, especially on mobile devices that can't run heavy JavaScript.
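For the speech-to-text side, a rough sketch of wiring up x-webkit-speech from script could look like the following. It assumes a WebKit/Chrome build that still honours the attribute and fires the webkitspeechchange event; as far as I know the attribute has since been dropped from Chrome, so treat this as an illustration only. The names createSpeechInput and onText are invented for the example. Note that this only yields the recognised text, not the recorded audio.

```javascript
// Sketch: create a text input with speech enabled and forward the
// recognised text to a callback. `doc` is normally the global `document`;
// `createSpeechInput` and `onText` are names invented for this example.
function createSpeechInput(doc, onText) {
  const input = doc.createElement('input');
  input.setAttribute('type', 'text');
  // Boolean attribute that asks a supporting browser to show the microphone button.
  input.setAttribute('x-webkit-speech', '');
  // Fired by supporting browsers once the recognised text is in the field.
  input.addEventListener('webkitspeechchange', () => onText(input.value));
  return input;
}
```

In a page this might be used as document.body.appendChild(createSpeechInput(document, text => console.log(text))). In browsers without support the element simply behaves like a plain text field.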

Please keep me updated on your progress. Not many people are working with all these new human-browser interaction APIs yet.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow