Can the matrix produced by MFCC feature extraction contain negative values?

https://stackoverflow.com/questions/22810534

26-06-2023

Question

I am using MFCCs as features for a speech recognizer, and I am stuck on the HMM implementation. I am using Kevin Murphy's HMM toolbox. My resulting MFCC matrix contains negative values. Can that be correct, or is my MFCC code wrong? This is the error I am getting:

Attempted to access obsmat(:,-39.5403); index must be a positive integer or logical.

Error in multinomial_prob (line 19)
  B(:,t) = obsmat(:, data(t));

Error in dhmm_em>compute_ess_dhmm (line 103)
 obslik = multinomial_prob(obs, obsmat);

Error in dhmm_em (line 47)
 [loglik, exp_num_trans, exp_num_visits1, exp_num_emit] = ...

Error in speechreco (line 77)
[LL, prior2, transmat2, obsmat2] = dhmm_em(dtr{1}, prior, A, B, 'max_iter', 5);

Also, if anyone knows of a link to MATLAB source code for HMMs, please share it. I am stuck on my final project: I am trying to implement a speech recognizer and don't know what to do after extracting the feature vectors.

This is the whole MATLAB code (I am using Kevin Murphy's HMM toolkit; the error is in the dhmm_em function):

function []=speechreco()

vtr = {8}; fstr = {8}; nbtr = {8};
ctr = {8};

for i = 1:8

    % Read audio data from train folder for performing operations
    st=strcat('train\s',num2str(i),'.wav');
    [s1 , fs1 , nb1]=wavread(st);  % st is the filename; s1 is the sample data, fs1 is the sampling rate in Hz, nb1 is the number of bits per sample
    vtr{i} = s1; fstr{i} = fs1; nbtr{i} = nb1;

    ctr{i} = mfcc(vtr{i},fstr{i});

end


display(ctr{1}); %MFCC matrix 20*129

W1 = transpose(ctr{1});

ch1=menu('Mel Space:','Signal 1','Signal 2','Signal 3',...
                        'Signal 4','Signal 5','Signal 6','Signal 7','Signal 8','Exit');
                    if ch1~=9
                        plot(linspace(0, (fstr{ch1}/2), 129), (melfb(20, 256, fstr{ch1})));
                        title('Mel-Spaced-Filterbank');
                        xlabel('Frequency[Hz]');
                    end


%error is here
[LL, prior2, transmat2, obsmat2] = dhmm_em(ctr{1}, prior, A, B, 'max_iter', 5);
plot(LL());

end

%%mfcc
%old one MFCC now
function r = mfcc(s, fs)
m = 100;
n = 256;
frame=blockFrames(s, fs, m, n); % FFT of each Hamming-windowed frame
m = melfb(20, n, fs);
n2 = 1 + floor(n / 2);
z = m * abs(frame(1:n2, :)).^2; % apply the triangular mel filterbank to the power spectrum
r = dct(log(z));  % take the log and then the DCT
end



%% blockFrames Function
% blockFrames: Puts the signal into frames
%
% Inputs: s contains the signal to analyze
% fs is the sampling rate of the signal
% m is the distance between the beginnings of two frames
% n is the number of samples per frame
%
% Output: M3 is a matrix containing all the frames

function M3 = blockFrames(s, fs, m, n)
l = length(s);
nbFrame = floor((l - n) / m) + 1;
for i = 1:n
    for j = 1:nbFrame
        M(i, j) = s(((j - 1) * m) + i); %#ok<AGROW>
    end
end
h = hamming(n);
M2 = diag(h) * M;
for i = 1:nbFrame
    M3(:, i) = fft(M2(:, i)); %#ok<AGROW>
end
end
%--------------------------------------------------------------------------

function m = melfb(p, n, fs)  %used for graph plot of power spectra
% MELFB Determine matrix for a mel-spaced filterbank 
% 
% Inputs: p number of filters in filterbank 
% n length of fft 
% fs sample rate in Hz 
% 
% Outputs: m a (sparse) matrix containing the filterbank amplitudes
% size(m) = [p, 1+floor(n/2)]
%
% Usage: For example, to compute the mel-scale spectrum of a
% column-vector signal s, with length n and sample rate fs:
% 
% f = fft(s); 
% m = melfb(p, n, fs); 
% n2 = 1 + floor(n/2); 
% z = m * abs(f(1:n2)).^2; 
% 
% z would contain p samples of the desired mel-scale spectrum 
%%%%%%%%%%%%%%%%%% 
%
f0 = 700 / fs; 
fn2 = floor(n/2); 
lr = log(1 + 0.5/f0) / (p+1); 
% convert to fft bin numbers with 0 for DC term 
bl = n * (f0 * (exp([0 1 p p+1] * lr) - 1)); 
b1 = floor(bl(1)) + 1; 
b2 = ceil(bl(2)); 
b3 = floor(bl(3)); 
b4 = min(fn2, ceil(bl(4))) - 1; 
pf = log(1 + (b1:b4)/n/f0) / lr; 
fp = floor(pf); 
pm = pf - fp; 
r = [fp(b2:b4) 1+fp(1:b3)]; 
c = [b2:b4 1:b3] + 1; 
v = 2 * [1-pm(b2:b4) pm(1:b3)]; 
m = sparse(r, c, v, p, 1+fn2); 
end
%----------------------------------------------------------------------

Solution

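The error is not caused by the negative values themselves: MFCC coefficients can legitimately be negative. The error says that a non-integer value (-39.5403) is being used as an index into obsmat. multinomial_prob evaluates obsmat(:, data(t)), so each observation is used as a column index and must be a positive integer. That means the inputs are constructed incorrectly: something has the wrong type, and values and indices have ended up in the wrong places. You need to share the whole code you wrote, not just the lines where you invoke HMM training, for the error to be pinpointed.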
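To make the two points concrete, here is a minimal sketch. The energies in z and the matrix obsmat are made-up illustrative values, not taken from the question's data. It shows that the log of a filterbank energy below 1 is negative, and that the column lookup from the traceback only works when the observations are positive integers.

```matlab
% Hypothetical mel filterbank energies; any energy below 1 has a negative log.
z = [0.02; 0.5; 3.0];
logz = log(z);                 % first two entries are negative
disp(logz.');

% Hypothetical discrete observation matrix: 2 states x 3 symbols.
obsmat = [0.7 0.2 0.1;
          0.1 0.3 0.6];
data = [1 3 2];                % valid observations: positive integer symbols

% Same kind of lookup as the line shown in the traceback:
% each observation selects a column of obsmat.
B = zeros(size(obsmat, 1), numel(data));
for t = 1:numel(data)
    B(:, t) = obsmat(:, data(t));
end
disp(B);

% A raw MFCC value such as -39.5403 cannot be used this way:
% obsmat(:, -39.5403) raises the "index must be a positive integer" error.
```

So negative MFCC values are expected; the problem is only that they are being used where symbol indices belong.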

Looking at your code, you probably need to call dhmm_em with ctr, not with ctr{1}.
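Whichever argument is passed, a discrete HMM needs its observations to be positive integer symbols. One possible way to get there, which is an assumption on my part and not something shown in the question, is to quantize each MFCC frame to the index of its nearest vector in a codebook. The sketch below uses random data as a stand-in for one MFCC matrix and a hypothetical codebook picked directly from the frames; a real codebook would normally be trained, for example by clustering.

```matlab
% Hypothetical stand-in for one MFCC matrix (20 coefficients x 129 frames).
rng(0);
C = randn(20, 129);

% Hypothetical codebook: K frames taken from the data as code vectors.
K = 8;
codebook = C(:, round(linspace(1, size(C, 2), K)));

% Assign each frame the index of its nearest code vector
% (squared Euclidean distance).
nFrames = size(C, 2);
symbols = zeros(1, nFrames);
for t = 1:nFrames
    d = sum((codebook - repmat(C(:, t), 1, K)).^2, 1);
    [~, symbols(t)] = min(d);
end

% symbols is now a 1 x nFrames row of integers in 1..K, which is the kind
% of sequence a column lookup such as obsmat(:, data(t)) can index with.
```

With this approach the observation matrix would need K columns, one per symbol. If the continuous MFCC values are to be modelled directly, a discrete HMM is not the right fit.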

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow