Question

I am currently in the discussion phase of a voice recognition project. I use MFCC feature extraction, but the MFCC feature returned from the function is a matrix, e.g. a (20, 38) feature matrix for each voice file (wav). How can I pass this feature to an SVM classifier? For an SVM (and other classifiers), each sample is represented by a vector, right? But the MFCC feature for each sample is a matrix. Suppose Xi is the MFCC feature for sample i. Should the feature passed to the SVM for sample i be: 1) a 20*38 vector, e.g. Xi(:) in MATLAB notation; 2) mean(Xi); or 3) one of the columns or rows of Xi? Which way is right? Is there any useful code or paper on this?

thanks! Shine


Solution

For a sequence tagging task like speech recognition, you need to use a combination of SVM and HMM, not just an SVM:

  1. Align the feature matrix to states with a GMM-HMM, and get the features corresponding to each HMM state
  2. Train an SVM on the features belonging to each state
  3. Implement an SVM-HMM instead of the GMM-HMM
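
As a rough illustration of how steps 1 and 2 connect, here is a minimal sketch in plain Python. It assumes the alignment already exists as one state id per frame (the `alignment` list below is made up, standing in for the output of a GMM-HMM forced alignment), and the helper name `frames_by_state` is hypothetical. The point is that each column (frame) of the MFCC matrix becomes its own training vector, labelled by its state, rather than the whole matrix being squeezed into one vector per file.

```python
from collections import defaultdict


def frames_by_state(mfcc, alignment):
    """Group MFCC frames by the HMM state each frame was aligned to.

    mfcc:      list of rows, shape (n_coeffs, n_frames), e.g. 20 x 38.
    alignment: one state id per frame (assumed to come from a
               GMM-HMM alignment; here it is simply given).
    """
    n_frames = len(mfcc[0])
    if len(alignment) != n_frames:
        raise ValueError("alignment must have one state id per frame")

    grouped = defaultdict(list)
    for t, state in enumerate(alignment):
        # Column t of the matrix is the feature vector of frame t.
        frame = [row[t] for row in mfcc]
        grouped[state].append(frame)
    return dict(grouped)


# Toy data: 3 coefficients x 4 frames instead of 20 x 38.
mfcc = [
    [0.1, 0.2, 0.3, 0.4],
    [1.0, 1.1, 1.2, 1.3],
    [2.0, 2.1, 2.2, 2.3],
]
alignment = [0, 0, 1, 1]  # made-up state ids, one per frame

grouped = frames_by_state(mfcc, alignment)
```

Each entry of `grouped` is then a list of fixed-length vectors for one state, which is the shape of data an SVM trainer expects.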

To learn more, read:

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.27.442

To make it fast, use existing toolkits like:

http://www.cs.cornell.edu/people/tj/svm_light/svm_hmm.html
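
If you go the toolkit route, the frames have to be written out in the toolkit's input format. The sketch below assumes the SVM-light style line layout `TAG qid:SEQ FEATNUM:FEATVAL ...`, with feature numbers starting at 1 and one line per frame; please check this against the page linked above before relying on it. The helper name `to_svm_hmm_lines` and the toy values are hypothetical.

```python
def to_svm_hmm_lines(frames, tags, sequence_id):
    """Format one utterance as text lines, one line per frame.

    frames:      list of per-frame feature vectors (lists of floats).
    tags:        one integer label per frame.
    sequence_id: integer identifying the utterance (the qid field).

    Assumed layout: "TAG qid:SEQ 1:v1 2:v2 ...".
    """
    if len(frames) != len(tags):
        raise ValueError("need exactly one tag per frame")

    lines = []
    for frame, tag in zip(frames, tags):
        # Feature indices are written 1-based; values use up to
        # 6 significant digits via the 'g' format.
        feats = " ".join(f"{i}:{v:g}" for i, v in enumerate(frame, start=1))
        lines.append(f"{tag} qid:{sequence_id} {feats}")
    return lines


# Two frames with two coefficients each, labelled with made-up tags.
lines = to_svm_hmm_lines([[0.1, 1.0], [0.2, 1.1]], [1, 2], sequence_id=1)
```

Writing `"\n".join(lines)` to a file would then give a training file with one sequence in it; several utterances would each get their own `sequence_id`.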

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow