Question

I used Sphinx4 for some time, and it really fits my needs. I load a recognizer, pass the audio data to it, and use the recognized String in my application.

Right now I'm working on a C application (C++ is unfortunately not an option) where I need something similar and thought that I could use Sphinx3 which is written in C.

The problem is that I don't really know how to use it inside an application, and there is no "Hello World" example like the one Sphinx4 provides.

I already compiled and installed sphinxbase and sphinx3 and now I can include the sphinx header files in my application.

Now to my questions:

  • Is there a "simple" and well documented example application that uses sphinx3 from a C environment?
  • How can I load up the sphinx3 engine and call a recognizer with my binary audio data?
  • OR: Do I need to start an application like "sphinx3_decode" and call it from my own application? If so, is there an example application for that?

Thank you in advance!

Best regards, Robert

Solution

It's not recommended to use Sphinx3. From the website:

Sphinx-3 is CMU’s large vocabulary speech recognition system. It’s older C based decoder that we continue to maintain. It’s planned to make it obsolete in the future, it’s still most accurate decoder for large vocabulary tasks. We are using it as a baseline to check the recognizer accuracy. This decoder is only intended for researchers who want to evaluate bleeding edge methods in ASR like tree search method.

If you need a decoder, you should use pocketsphinx. You can find the tutorial and the API documentation on the website:

http://cmusphinx.sourceforge.net/wiki/tutorialpocketsphinx

http://cmusphinx.sourceforge.net/api/pocketsphinx/pocketsphinx_8h.html

Other tips

I recently worked on an integrated project on the Punjabi language. Here are the steps we used:

  • First, we recorded the Punjabi audio data in a soundproofed room at a 16,000 Hz sample rate.
  • We then segmented the recordings with Praat into short wav and raw files of 2 to 30 seconds each and saved them in a folder named "train".
  • On a Linux (Ubuntu) machine, we installed the required build tools, such as autoconf and automake, and unpacked Sphinx 3 along with four packages: cmuclmtk, pocketsphinx, sphinxbase, and sphinxtrain.
  • Based on the short wav files, we created the supporting files: transcription, dic, phone, filler, file id, ccs, etc.
  • We opened a terminal and ran "sphinx_fe" to check whether Sphinx was functional.
  • We created a folder named "man" and changed to it in the terminal.
  • We ran the command "sphinxtrain -t man setup". This creates a folder named "etc" inside "man" containing "feat.params" and the config file.
  • We edited the config file to match our data.
  • We moved all the files created earlier (transcription, dic, and so on) into the "etc" folder inside "man".
  • We placed the "lang1.sh" script in the "etc" folder and the remaining four scripts in the "man" folder.
  • We changed to the "etc" folder in the terminal and ran "lang1.sh".
  • Finally, we ran a series of scripts in the terminal: "mfcgen2.sh", then "verify3.sh", then "hmm4.sh", and lastly "end-test.sh" to get the final result.
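
Since the question is about C, here is a minimal sketch of two small checks that could accompany the data preparation steps above. It is not part of the original workflow. It assumes the raw files are headerless, mono, 16-bit PCM (the post only states the 16,000 Hz rate), and it assumes the transcription file uses the "<s> TEXT </s> (FILE_ID)" line layout described in the CMUSphinx training tutorial. All function names are hypothetical.

```c
#include <stdio.h>
#include <stddef.h>

/* Assumptions, not stated in the original post: mono audio, 16-bit samples. */
#define SAMPLE_RATE_HZ   16000
#define BYTES_PER_SAMPLE 2

/* Duration in seconds of a headerless raw PCM segment of the given size. */
double raw_duration_seconds(long num_bytes)
{
    return (double)num_bytes / (BYTES_PER_SAMPLE * SAMPLE_RATE_HZ);
}

/* Returns 1 if the segment is within the 2 to 30 second range
   used for the training segments, 0 otherwise. */
int segment_length_ok(long num_bytes)
{
    double secs = raw_duration_seconds(num_bytes);
    return secs >= 2.0 && secs <= 30.0;
}

/* Writes one transcription line of the assumed form
   "<s> TEXT </s> (FILE_ID)" into out.
   Returns the number of characters written (excluding the NUL),
   or -1 if the buffer is too small. */
int format_transcription_line(char *out, size_t out_size,
                              const char *text, const char *file_id)
{
    int n = snprintf(out, out_size, "<s> %s </s> (%s)", text, file_id);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}
```

For example, a 64,000-byte raw file is 2 seconds long under these assumptions and passes segment_length_ok, while a 32,000-byte file (1 second) does not. Calling format_transcription_line with the text "hello world" and the file id "utt_001" fills the buffer with "<s> hello world </s> (utt_001)".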

Beyond that, if you have worked with Sphinx 4, you probably already know the files mentioned in the steps above. I hope this helps.

Licensed under: CC-BY-SA with attribution