Question

I'm trying to build a rather long-winded chain of programs and libraries that culminates in using a speech-to-text API to turn an MP3 file into human-readable text. I was surprised to find very few online APIs that do this. The only working one I found was the speech2text project, https://github.com/taf2/speech2text, which hooks into Google's unofficial Speech-To-Text API.

This actually worked at first. I did a few manual conversions and was pleased with the results. However, since I started automating the chain of processes in Java, it has stopped working properly.

EDIT - The following error messages technically come from flac itself, not speech2text. Converting these files with flac alone, without speech2text, also produces the ID3v2 error message, so the error is not really caused by speech2text (although speech2text might be the source of the erroneous tags).

Java reports this error (after calling speech2text through a ProcessBuilder and printing its output streams):

/Library/Ruby/Gems/1.8/gems/speech2text-0.3.4/lib/speech/audio_inspector.rb:50:in `initialize': undefined method `first' for nil:NilClass (NoMethodError)
    from /Library/Ruby/Gems/1.8/gems/speech2text-0.3.4/lib/speech/audio_splitter.rb:77:in `new'
    from /Library/Ruby/Gems/1.8/gems/speech2text-0.3.4/lib/speech/audio_splitter.rb:77:in `initialize'
    from /Library/Ruby/Gems/1.8/gems/speech2text-0.3.4/lib/speech/audio_to_text.rb:15:in `new'
    from /Library/Ruby/Gems/1.8/gems/speech2text-0.3.4/lib/speech/audio_to_text.rb:15:in `to_text'
    from /Library/Ruby/Gems/1.8/gems/speech2text-0.3.4/bin/speech2text:11
    from /usr/bin/speech2text:19:in `load'
    from /usr/bin/speech2text:19

However, attempting to run the command manually on the same file actually gives me this:

ERROR: input file ./chunk-abortion-test-audio-0.mp3 has an ID3v2 tag
/Library/Ruby/Gems/1.8/gems/speech2text-0.3.4/lib/speech/audio_splitter.rb:59:in `to_flac': failed to convert chunk: ./chunk-abortion-test-audio-0.mp3 with flac ./chunk-abortion-test-audio-0.mp3 (RuntimeError)
    from /Library/Ruby/Gems/1.8/gems/speech2text-0.3.4/lib/speech/audio_to_text.rb:18:in `to_text'
    from /Library/Ruby/Gems/1.8/gems/speech2text-0.3.4/lib/speech/audio_to_text.rb:17:in `each'
    from /Library/Ruby/Gems/1.8/gems/speech2text-0.3.4/lib/speech/audio_to_text.rb:17:in `to_text'
    from /Library/Ruby/Gems/1.8/gems/speech2text-0.3.4/bin/speech2text:11
    from /usr/bin/speech2text:19:in `load'
    from /usr/bin/speech2text:19
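
To compare the Java run and the manual run like for like, it can help to have Java capture the child's stderr and stdout as one stream, along with the exit code. The following is only a minimal sketch of that idea, not the code actually used here: the class name CommandRunner and its Result holder are made up for illustration, and the command to run is whatever is passed on the command line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of speech2text): runs an external command
// and returns its exit code together with everything it printed.
class CommandRunner {

    static final class Result {
        final int exitCode;
        final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    static Result run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Merge stderr into stdout so error messages appear in the order
        // the child process wrote them, and only one stream needs draining.
        builder.redirectErrorStream(true);
        Process process = builder.start();

        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        // Wait only after the stream is drained, so a chatty child cannot
        // block on a full pipe buffer.
        int exitCode = process.waitFor();
        return new Result(exitCode, output.toString());
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java CommandRunner <command> [args...]");
            return;
        }
        Result result = run(Arrays.asList(args));
        System.out.print(result.output);
        System.out.println("exit code: " + result.exitCode);
    }
}
```

Running the same command line through this and through the terminal, from the same working directory, should make it easier to see whether the two really fail at different points.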

Of course, the irony here is that I had already cleaned the file of ID3v2 tags using id3v2 --delete-all in a Mac terminal, so something screwy is going on.
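
One way to check whether a given file still carries a tag is to look at its first bytes: an ID3v2 tag placed at the start of a file begins with the three ASCII bytes "ID3". The sketch below does only that check; the class and method names are made up for illustration, and it does not detect tags stored anywhere other than the very start of the file. Note that the file named in the error is a chunk file, which is not necessarily the same file that was cleaned.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: reports whether a file begins with an ID3v2 header.
class Id3v2Check {

    // True if the given leading bytes start with the "ID3" identifier.
    static boolean startsWithId3v2(byte[] head) {
        return head.length >= 3
                && head[0] == 'I'
                && head[1] == 'D'
                && head[2] == '3';
    }

    // Reads up to three bytes from the start of the file and checks them.
    static boolean fileStartsWithId3v2(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] head = new byte[3];
            int read = 0;
            while (read < head.length) {
                int n = in.read(head, read, head.length - read);
                if (n < 0) {
                    break; // file is shorter than three bytes
                }
                read += n;
            }
            return read == head.length && startsWithId3v2(head);
        }
    }

    public static void main(String[] args) throws IOException {
        // Each argument is a path to check, e.g. the chunk file from the error.
        for (String name : args) {
            boolean tagged = fileStartsWithId3v2(Paths.get(name));
            System.out.println(name + ": " + (tagged ? "starts with ID3v2 tag" : "no leading ID3v2 tag"));
        }
    }
}
```

Checking both the original MP3 and the chunk file this way would show whether the tag survives the cleaning step or is reintroduced later in the chain.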

Can anyone suggest what might be happening? Also, given that speech2text hasn't seen an update in a year, I feel like there must be a newer speech-to-text solution that people are using. So if there's something better out there please let me know.

Cheers!

EDIT - Incidentally, if anyone's interested, the MP3 file was originally produced by extracting the audio from a .flv file using ffmpeg.


Solution

This is now two separate problems. I have only been able to resolve the ID3v2 issue by sidestepping .mp3 files and using .wav instead. The Java output is still an issue, so I'm moving that to a new question.
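
The .wav workaround is consistent with the flac command-line encoder reading WAV, AIFF, raw PCM or FLAC input rather than MP3. Assuming the audio is still extracted with ffmpeg as described in the question, the extraction step could write WAV directly, roughly as sketched below. The class name and the file names are placeholders, and the options used (-y to overwrite, -i for the input, -vn to drop video) are only one reasonable choice.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds and runs an ffmpeg command that extracts
// the audio track of a media file into a WAV file.
class WavConversion {

    // Builds the argument list; ffmpeg picks the WAV format from the
    // ".wav" extension of the output name.
    static List<String> ffmpegToWavCommand(String input, String outputWav) {
        return Arrays.asList("ffmpeg", "-y", "-i", input, "-vn", outputWav);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: java WavConversion <input> <output.wav>");
            return;
        }
        // inheritIO lets ffmpeg print straight to this process's console.
        Process process = new ProcessBuilder(ffmpegToWavCommand(args[0], args[1]))
                .inheritIO()
                .start();
        int exitCode = process.waitFor();
        System.out.println("ffmpeg exited with " + exitCode);
    }
}
```

For example, "java WavConversion test-audio.flv test-audio.wav" (placeholder file names) would produce a WAV file that can be handed to the rest of the chain in place of the MP3.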

Licensed under: CC-BY-SA with attribution