Question

I've got Sphinx-4 installed on my Windows XP system and JSAPI set up. I'd like to transcribe a WAV (or MP3) file of spoken English to text.

When I run the "WavFile" demo, it runs successfully:

java -jar WavFile.jar

But, when I pass my own wav file like this:

java -jar WavFile.jar c:\test.wav

I get:

Loading Recognizer as defined in 'jar:file:/C:/sphinx4-1.0beta3-bin/sphinx4-1.0beta3/bin/WavFile.jar!/edu/cmu/sphinx/demo/wavfile/config.xml'...

Decoding jar:file:/C:/sphinx4-1.0beta3-bin/sphinx4-1.0beta3/bin/WavFile.jar!/edu/cmu/sphinx/demo/wavfile/12345.wav Result: one two three four five

It seems this demo is set up to load and run an internal wav file ("12345.wav") and to ignore the file I pass in.

I've read the docs and just can't figure out how to set up the "config.xml" or even what directory to place it in. I'm just trying to get a simple proof of concept running using the standard demos.

So, the question is: how do I run a Sphinx4 program to transcribe a wav file?

Thanks.


Solution

Not sure if you still need an answer, but I think this link is what you want (although it only works for spoken digits): http://cmusphinx.sourceforge.net/sphinx4/src/apps/edu/cmu/sphinx/demo/transcriber/README.html

OTHER TIPS

What's needed is to write a new application (based on Transcriber.java) that uses the CMU Dictionary (American English) instead of the numbers that Transcriber.jar supports.
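One small piece such an application would need is to take the audio file from the command line instead of always decoding the bundled 12345.wav. Below is a minimal sketch of that step using only the JDK. The class name AudioInput and the method resolveAudioUrl are hypothetical and are not part of Sphinx-4; loading the recognizer and the dictionary is deliberately left out.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper, not part of Sphinx-4: decides which audio file to decode.
class AudioInput {

    // Returns the file named by the first command-line argument as a URL,
    // or the given default (e.g. a bundled sample) when no argument is passed.
    static URL resolveAudioUrl(String[] args, URL bundledDefault)
            throws MalformedURLException {
        if (args.length > 0) {
            return new File(args[0]).toURI().toURL();
        }
        return bundledDefault;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: a sample named 12345.wav sits in the working directory.
        URL fallback = new File("12345.wav").toURI().toURL();
        URL audio = resolveAudioUrl(args, fallback);
        System.out.println("Decoding " + audio);
        // The recognizer would be given this URL here (omitted in this sketch).
    }
}
```

Run as `java AudioInput c:\test.wav`, this would print the URL of test.wav rather than that of the default sample.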

It is quite strange that Sphinx does not come with such a useful sample.

I know this is a super old thread, but I just wanted to point out that your example seems to have run perfectly. If you look at the very end of your output:

Decoding jar:file:/C:/sphinx4-1.0beta3-bin/sphinx4-1.0beta3/bin/WavFile.jar!/edu/cmu/sphinx/demo/wavfile/12345.wav Result: one two three four five <========== RESULTS FROM DECODING WAV AUDIO!

Look at the pocketsphinx package. It's written in C, has been compiled for every platform, and can be used from the command line or as part of an app. I have been using it from the command line and it is extraordinarily versatile.

Licensed under: CC-BY-SA with attribution