Failed to align audio to transcript during adaptation

https://stackoverflow.com/questions/20726032

20-09-2022

Question

I am trying to adapt an acoustic model for use with Sphinx4, using some of my own transcribed data. The adaptation data is sampled at 8 kHz, so I changed the params file of the original acoustic model (which was built for 16 kHz audio), since I use that file throughout the adaptation process:

-lowerf 200.00
-upperf 3500.00
-nfilt 31
-ncep 13
-transform legacy
-round_filters yes
-unit_area yes
-remove_dc no
-feat 1s_c_d_dd
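
As a sanity check on settings like these, the filter-bank range can be compared against the Nyquist frequency (half the sample rate) of the audio. The following is a minimal sketch, not part of the Sphinx tools: `parse_params` and `filterbank_fits` are hypothetical helper names, and the check only covers the frequency range. It says nothing about whether the model itself was trained on audio of that bandwidth.

```python
def parse_params(text):
    """Parse '-flag value' lines into a dict of strings keyed by flag name."""
    params = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].startswith("-"):
            params[parts[0].lstrip("-")] = parts[1]
    return params


def filterbank_fits(params, sample_rate_hz):
    """Return True if lowerf..upperf lies within 0..Nyquist for this rate."""
    nyquist = sample_rate_hz / 2.0
    lower = float(params["lowerf"])
    upper = float(params["upperf"])
    return 0.0 <= lower < upper <= nyquist


# The params listed in the question, copied as text.
PARAMS_TEXT = """\
-lowerf 200.00
-upperf 3500.00
-nfilt 31
-ncep 13
-transform legacy
-round_filters yes
-unit_area yes
-remove_dc no
-feat 1s_c_d_dd
"""

params = parse_params(PARAMS_TEXT)
print(filterbank_fits(params, 8000))  # 3500 Hz <= 4000 Hz Nyquist, prints True
```

A range that fits under Nyquist is necessary for 8 kHz audio, but it is not sufficient to make a wide-band model usable.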

Feature extraction appears to work fine, but the Baum-Welch step produces many errors. For reference, the Baum-Welch command-line parameters are shown below:

-hmmdir ../hub4opensrc.cd_continuous_8gau -moddeffn ../hub4opensrc.cd_continuous_8gau/mdef.txt -ts2cbfn .cont. -feat 1s_c_d_dd -cmn current -agc none -dictfn ../adaptationData.dict -ctlfn ../adaptationData.listoffiles -lsnfn ../adaptationData.transcription -accumdir .

The errors are the same for every file; one example is shown below:

INFO: cmn.c(175): CMN:  9.69  0.13 -0.11 -0.13 -0.19 -0.23 -0.25 -0.19 -0.22 -0.19 -0.10 -0.09 -0.07 
ERROR: "backward.c", line 421: Failed to align audio to trancript: final state of the search is not reached
ERROR: "baum_welch.c", line 324: sn74tiCEB6F7DE7672F ignored
utt>   233       sn74tiCEB6F7DE7672F  177    0   112 12  utt 0.000x 0.000e upd 0.000x 0.000e fwd 0.000x 0.000e bwd 0.000x 0.000e gau 0.000x 0.000e rsts 0.000x 0.000e rstf 0.000x 0.000e rstu 0.000x 0.000e
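
When every utterance fails like this, it can help to list which ones were skipped. The sketch below assumes the log has been saved to text and that skipped utterances always appear in the `baum_welch.c` ... `ignored` form shown above; the function name `ignored_utterances` is hypothetical.

```python
import re

# Matches lines such as:
# ERROR: "baum_welch.c", line 324: sn74tiCEB6F7DE7672F ignored
IGNORED_RE = re.compile(r'ERROR: "baum_welch\.c", line \d+: (\S+) ignored')


def ignored_utterances(log_text):
    """Return the utterance IDs that the log reports as ignored, in order."""
    return IGNORED_RE.findall(log_text)
```

Comparing the length of this list with the number of entries in the control file shows whether the failure is global (as in this question) or limited to a few utterances.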

I am confused about why the algorithm does not complete. If anyone has run into this issue before, I would appreciate suggestions on how to overcome it.

Solution

Hub4 is a 16 kHz acoustic model; you cannot adapt it to recognize 8 kHz audio. You need to adapt a narrow-bandwidth acoustic model instead. For example, you can adapt the Communicator continuous model from the downloads or the wsj_8khz model from sphinx4.
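
Before starting an adaptation run, it may be worth confirming that the audio and the chosen model agree on sample rate. This is a minimal sketch that assumes the adaptation audio is stored as WAV files; `wav_sample_rate` and `mismatched_files` are hypothetical helper names, and the model's rate is something you supply yourself (8000 for a narrow-band model, 16000 for Hub4).

```python
import wave


def wav_sample_rate(path):
    """Return the sample rate in Hz stored in a WAV file header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def mismatched_files(paths, model_rate_hz):
    """Return (path, rate) pairs whose sample rate differs from the model's."""
    mismatches = []
    for path in paths:
        rate = wav_sample_rate(path)
        if rate != model_rate_hz:
            mismatches.append((path, rate))
    return mismatches
```

An empty result from `mismatched_files(paths, 8000)` means every file matches a narrow-band model; any entries returned are files that would need resampling or a different model.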

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow