Why are HMMs appropriate for speech recognition when the problem doesn't seem to satisfy the Markov property?

cs.stackexchange https://cs.stackexchange.com/questions/37709

Question

I'm learning about HMMs and their applications and trying to understand how they are used. My knowledge is a bit spotty, so please correct any incorrect assumptions I'm making. The specific example I'm wondering about is using HMMs for speech recognition, which is a common example in the literature.

The basic method seems to be to treat the incoming sounds (after processing) as observations and the actual words being spoken as the hidden states of the process. It seems obvious that the hidden variables here are not independent, but I do not understand how they satisfy the Markov property. I would imagine that the probability of the Nth word depends not just on the (N-1)th word, but on many of the words preceding it.
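
To make the assumption concrete, here is a sketch in notation that is not part of the original post: $w_n$ stands for the $n$-th hidden state (here, a word) and $o_n$ for the $n$-th observation (the processed sound). Under those assumed symbols, a first-order HMM requires

$$P(w_n \mid w_1, \dots, w_{n-1}) = P(w_n \mid w_{n-1}), \qquad P(o_n \mid w_1, \dots, w_n,\, o_1, \dots, o_{n-1}) = P(o_n \mid w_n).$$

The worry is about the first equality: for real speech, the left-hand side seems to depend on more of the history than $w_{n-1}$ alone.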

Is this simply ignored as a simplifying assumption because HMMs work well in practice on speech recognition problems, or am I misunderstanding what the states and hidden variables in the process are? The same problem would appear to apply to many other applications in which HMMs are popular, such as POS tagging.

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange