Pregunta

I would like to see if it's possible to have direct access to Opus using getUserMedia or anything similar from the latest browsers.

I've been researching on it a lot but with no Good results.

I'm aware that either Opus or Speex are actually used in webkitSpeechRecognition API. I would like to do speech recognition but using my own server rather than Google's.

¿Fue útil?

Solución

So there are a lot of suggestions about Emscripten but nobody did, so I ported the encoder opus-tools to JavaScript using Emscripten. Dependent on what one has in mind, there are now the following opportunities:

Otros consejos

We're using emscripten for encoding and decoding using gsm610 with getUserMedia, and it works incredibly well, even on mobile devices. These days javascript gives almost native performance, so emscripten is viable for compiling codecs. The only issue is potentially very large .js files, so you want to only compile the parts you are using.

Unfortunately, it isn't currently possible to access browser codecs directly from JavaScript for encoding. The only way to do it would be to utilize WebRTC and set up recording on the server. I've tried this by compiling libjingle with some other code out of Chromium to get it to run on a Node.js server... it's almost impossible.

The only thing you can do currently is send raw PCM data to your server. This takes up quite a bit of bandwidth, but you can minimize that by converting the float32 samples down to 16 bit (or 8 bit if your speech recognition can handle it).

Hopefully the media recorder API will show up soon so we can use browser codecs.

This is not a complete solution, @Brad's answer is actually the correct one at this time.

One way to do it is to compile Opus to Emscripten and hope that your PC can handle encoding using JavaScript. Another alternative is to use speex.js.

Licenciado bajo: CC-BY-SA con atribución
No afiliado a StackOverflow
scroll top