Question

Do all current voice-to-text algorithms operate in real time? I don't mean a person sitting at a computer with a microphone, but rather feeding in a pre-recorded audio file.

For example, if you have a 30-minute voice recording, will it always take 30 minutes to transcribe?

Are there different approaches out there?

Solution

There is no reason why speech recognition must take as long as the audio itself. However, given the computation required, I don't think you will get an algorithm that is hugely faster than real time. See this section of the Wikipedia article for more detail (it doesn't seem to give any timings, though it does give a decent overview of performance).
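
One common way to express how transcription time relates to audio length is the real-time factor (RTF): processing time divided by audio duration. An RTF of 1.0 means a 30-minute recording takes 30 minutes; below 1.0 is faster than real time. The sketch below is only illustrative: the function names are hypothetical and the numbers are made-up inputs, not measurements of any particular recognizer.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Return processing time divided by audio duration (RTF)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def estimated_processing_minutes(audio_minutes, rtf):
    """Estimate transcription time for a recording, given an assumed RTF."""
    return audio_minutes * rtf


# Hypothetical example: a 30-minute file (1800 s) transcribed in 15 minutes (900 s).
rtf = real_time_factor(900, 1800)
print(rtf)                                    # 0.5, i.e. twice as fast as real time
print(estimated_processing_minutes(30, rtf))  # 15.0
```

The actual RTF depends on the recognizer and on the hardware it runs on, so it has to be measured rather than assumed.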

OTHER TIPS

There is nothing stopping the algorithm from running faster than real time. The Naturally Speaking 10 Professional software provides a "transcribe from file" option for converting dictation taken while away from a computer; it operates as fast as the computer it is running on can manage.

I believe batch-processing implementations exist in the area of signals intelligence, but such programs would, naturally, be unavailable to the general public.

Licensed under: CC-BY-SA with attribution