Question

I have a problem with environmental sound classification. I want to use audio classification to detect one specific type of collision sound, which is quite distinctive and very easy for a human listener to recognize. Other types of collision sound may also occur, but they are not important to me; all I need is for them not to be classified as my "specific type of collision sound".

I am currently using GMMs with LFCC features for the classification: one GMM trained on all the LFCCs from that type of collision sound, and a second GMM trained on all the other LFCCs (from non-collision environmental sounds or from other types of collision that I don't want). Performance is currently very poor: recall is very high but precision is extremely low. I find that although the GMM for the "specific type of sound" gives a very low probability when that sound is not occurring, the GMM for everything else also gives a low probability when one of the other types of collision is occurring.
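To make the two-model comparison concrete, here is a minimal sketch of one possible decision rule: rather than looking at either model's score alone, compare the two by their average per-frame log-likelihood difference. It assumes the per-frame log-likelihoods have already been computed by the two GMMs (for example, scikit-learn's `GaussianMixture.score_samples` returns such values). The function names and the `threshold` parameter are hypothetical, and the threshold would have to be tuned on held-out recordings.

```python
from statistics import fmean


def average_log_likelihood_ratio(target_ll, background_ll):
    """Mean per-frame value of log p(x | target) - log p(x | background).

    Both arguments are sequences of per-frame log-likelihoods for the
    same clip, one from the target GMM and one from the "everything
    else" GMM (assumed to be computed elsewhere).
    """
    if len(target_ll) != len(background_ll) or len(target_ll) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    return fmean([t - b for t, b in zip(target_ll, background_ll)])


def is_target(target_ll, background_ll, threshold=0.0):
    """Accept the clip only if the target model wins by more than `threshold`.

    Raising the (hypothetical) threshold trades recall for precision.
    """
    return average_log_likelihood_ratio(target_ll, background_ll) > threshold


# Made-up numbers: both models give low likelihoods, but the target
# model is consistently less bad than the background model.
target_scores = [-40.0, -42.0, -41.0]
background_scores = [-45.0, -44.0, -46.0]
print(average_log_likelihood_ratio(target_scores, background_scores))
print(is_target(target_scores, background_scores))
print(is_target(target_scores, background_scores, threshold=5.0))
```

With a rule like this, a clip where both models score low is rejected unless the target model is clearly the better fit, which is one way to push precision up at some cost in recall.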

In this situation, should I switch to another model such as an ANN or an SVM, or do I need to add more GMMs? I was thinking of, for instance, GMM_1 for the type of collision I want, GMM_2 for other types of collision, and GMM_3 for anything else. But it is hard for me to collect "all other types of collision", and I am not sure whether this approach would actually increase accuracy.

Solution

Well, I have to answer my own question. Over the past few days I did some testing with the three-GMM setup I described in the question. It works fine. With more training data, I am confident I could reach an accuracy above 90%.
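For anyone who wants to try the same idea, the decision step with three models can be sketched roughly as follows. This is only an illustration, not my exact code: it assumes each GMM has already produced per-frame log-likelihoods for a clip, that all three classes are treated as equally likely a priori, and the function name `classify_clip` is made up for the example.

```python
def classify_clip(frame_scores):
    """Pick the model with the highest average per-frame log-likelihood.

    frame_scores maps a model name (e.g. "GMM_1") to the list of
    per-frame log-likelihoods that model assigned to one clip.
    Returns the winning model name and the per-model averages.
    """
    if not frame_scores:
        raise ValueError("no model scores given")
    averages = {}
    for name, scores in frame_scores.items():
        if len(scores) == 0:
            raise ValueError(f"no frames scored for {name}")
        averages[name] = sum(scores) / len(scores)
    winner = max(averages, key=averages.get)
    return winner, averages


# Made-up scores for one clip containing an unwanted collision:
# GMM_1 = wanted collision, GMM_2 = other collisions, GMM_3 = anything else.
clip = {
    "GMM_1": [-50.0, -52.0],
    "GMM_2": [-40.0, -42.0],
    "GMM_3": [-60.0, -62.0],
}
winner, averages = classify_clip(clip)
# The clip is reported as a detection only when GMM_1 wins.
detected = winner == "GMM_1"
print(winner, detected)
```

The point of the extra model is that an unwanted collision now has a model of its own to win against GMM_1, instead of being a poor fit for both models.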

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow