Question

I have a 21000x13 matrix of MFCCs computed from a wav file. I also have a label file (a text file) that lists the start time, end time and label of each time period. I need to find the time of each frame in the MFCC matrix so that the labels can be assigned frame by frame. Does anyone know the frame length (20 ms/30 ms/50 ms) and the overlap (30%/40%/50%) that are used? With those I could find the time at which each frame falls, as the frame number times the frame length, adjusted for the overlap. E.g. 1 x 20 ms = 20 ms, and the next frame would be at 2 x 20 ms = 40 ms, but taking the overlap into account it would be at 30 ms with 50% overlap.

Solution

In VOICEBOX's melcepst, the default sampling rate is 11025 Hz.

The default frame size is the largest power of 2 that is less than 0.03 * sampling rate, i.e. at most about 30 ms of audio. For the default sampling rate, 0.03 * 11025 = 330.75, so the frame size is 256 samples (about 23.2 ms). You can compute it with this formula:

pow2(floor(log2(0.03*fs)))
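As a quick sanity check, the same calculation can be sketched in Python. This is only an illustration of the MATLAB expression above. The function name `default_frame_length` is made up for this example and is not part of VOICEBOX.

```python
import math

def default_frame_length(fs: float) -> int:
    """Largest power of 2 not exceeding 0.03 * fs.

    Mirrors the MATLAB expression pow2(floor(log2(0.03*fs))).
    """
    return 2 ** math.floor(math.log2(0.03 * fs))

n = default_frame_length(11025)  # 0.03 * 11025 = 330.75, so n is 256
inc = n // 2                     # 50% overlap, so the frame increment is 128
print(n, inc)
```

For other sampling rates the frame length changes accordingly, so the value should be recomputed for the rate actually passed to melcepst.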

Default overlap is 50%.

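So the default frame increment (frame shift) is 128 samples. To get the start time of a frame in seconds, multiply the frame index by the frame shift (128) and divide by the sampling rate (11025), which gives steps of about 11.6 ms. Count frames from 0 here, so row i of the MATLAB matrix corresponds to index i - 1, and each frame covers 256 samples (about 23.2 ms) from its start time.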
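A minimal sketch of how these numbers could be used to label each frame, written in Python for illustration. It assumes the default parameters above (fs = 11025, frame length 256, increment 128), that the label file has already been parsed into `(start_s, end_s, label)` tuples, and that a frame takes the label of the segment containing its center. The function names and the center-based rule are assumptions for this example, not something melcepst prescribes.

```python
def frame_times(num_frames, fs=11025, n=256, inc=128):
    """Return (start, center, end) in seconds for frames 0..num_frames-1."""
    times = []
    for k in range(num_frames):
        start = k * inc / fs
        times.append((start, start + n / (2 * fs), start + n / fs))
    return times

def label_frames(num_frames, segments, fs=11025, n=256, inc=128, default=None):
    """Assign each frame the label of the segment that contains its center.

    segments: list of (start_s, end_s, label) tuples, e.g. parsed from the
    label file. Frames whose center falls in no segment get `default`.
    """
    labels = []
    for _, center, _ in frame_times(num_frames, fs, n, inc):
        label = default
        for seg_start, seg_end, seg_label in segments:
            if seg_start <= center < seg_end:
                label = seg_label
                break
        labels.append(label)
    return labels

# Hypothetical label data, just to show the call.
segments = [(0.0, 0.02, "sil"), (0.02, 0.05, "speech")]
print(label_frames(4, segments))
```

If the MFCCs were computed with a different sampling rate, frame length or increment, pass those values instead of the defaults. Labeling by frame start instead of frame center is an equally reasonable choice when the segments are long compared with the 23 ms frame.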

You can find the details in the header of the function documentation here:

http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/doc/voicebox/melcepst.html

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow