Question

I am currently in the discussion phase of a voice recognition project. I use MFCC feature extraction, but the MFCC feature returned from the function is a matrix, e.g. a (20,38) feature matrix for each voice file (wav). How can I pass this feature to an SVM classifier? For an SVM (and other classifiers), each sample is represented by a vector, right? But the MFCC feature for each sample is a matrix. Suppose Xi is the MFCC feature for sample i. Then the feature for sample i passed to the SVM could be: 1) a 20*38 vector, e.g. Xi(:) in MATLAB form; 2) mean(Xi); 3) one of the columns or rows of Xi. Which way is right? Is there any useful code or paper for this?

Thanks! Shine

Solution

For a sequence tagging task like speech recognition, you need to use a combination of SVM and HMM, not just an SVM:

  1. Align the feature matrix to states with a GMM-HMM to get the features corresponding to each HMM state
  2. Train an SVM on the features belonging to each state
  3. Implement SVM-HMM instead of GMM-HMM
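
Steps 1 and 2 amount to treating every frame, not every file, as one SVM sample. Here is a minimal sketch of that data preparation in plain Python. It assumes the (20,38) matrix is laid out as 20 coefficients by 38 frames, and that a GMM-HMM forced alignment has already produced one state id per frame. The names `frames_with_states`, `build_training_set` and the toy values are made up for illustration.

```python
def frames_with_states(mfcc, alignment):
    """Split one utterance into per-frame vectors paired with HMM state ids.

    mfcc      -- list of rows, one per coefficient, each with one value per
                 frame (assumed layout: n_coeffs x n_frames, e.g. 20 x 38)
    alignment -- list with one state id per frame (assumed to come from a
                 GMM-HMM forced alignment computed elsewhere)
    """
    n_frames = len(mfcc[0])
    if len(alignment) != n_frames:
        raise ValueError("alignment must have one state id per frame")
    # Transpose: each column of the matrix becomes one feature vector.
    frames = [[row[t] for row in mfcc] for t in range(n_frames)]
    return frames, list(alignment)


def build_training_set(utterances):
    """Pool the frames of all utterances into one (X, y) training set."""
    X, y = [], []
    for mfcc, alignment in utterances:
        frames, states = frames_with_states(mfcc, alignment)
        X.extend(frames)
        y.extend(states)
    return X, y


# Toy utterance: 3 coefficients x 4 frames instead of 20 x 38.
toy_mfcc = [
    [0.1, 0.2, 0.3, 0.4],
    [1.1, 1.2, 1.3, 1.4],
    [2.1, 2.2, 2.3, 2.4],
]
toy_alignment = [0, 0, 1, 2]  # hypothetical state id for each frame

X, y = build_training_set([(toy_mfcc, toy_alignment)])
```

`X` then holds one vector per frame and `y` the matching state labels, which is the shape a standard SVM implementation (for example scikit-learn's `SVC`) expects for training.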

To learn more, read:

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.27.442

To save time, use an existing toolkit such as:

http://www.cs.cornell.edu/people/tj/svm_light/svm_hmm.html
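
As far as I recall, that toolkit reads a text file with one frame per line in the form `TAG qid:EXNUM FEATNUM:FEATVAL ...`, where lines sharing a `qid` make up one sequence, feature numbers start at 1, and tags are positive integers. Please check the page before relying on this. Under that assumption, a rough sketch for writing one utterance could look like the following; the helper name `to_svm_hmm_lines` and the sample values are made up for illustration.

```python
def to_svm_hmm_lines(frames, states, seq_id):
    """Format one utterance as lines in the assumed SVM-HMM input format.

    frames -- list of per-frame feature vectors (lists of floats)
    states -- list of state ids, one per frame, assumed to start at 1
    seq_id -- integer identifying the utterance (written as qid)
    """
    if len(frames) != len(states):
        raise ValueError("need exactly one state id per frame")
    lines = []
    for vec, state in zip(frames, states):
        # Feature numbers are 1-based and written in ascending order.
        feats = " ".join(f"{i}:{v:g}" for i, v in enumerate(vec, start=1))
        lines.append(f"{state} qid:{seq_id} {feats}")
    return lines


# Two frames of a toy utterance with two features each.
example_lines = to_svm_hmm_lines([[0.5, -1.25], [0.75, 2.0]], [1, 2], seq_id=1)
print("\n".join(example_lines))
```

Writing the lines of all utterances to one file, with a different `seq_id` for each utterance, would give a training file in that layout.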

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow