Question

I am writing a program that removes vocals from a song using the FFT. Before implementing it in C#, I decided to test the frequency-removal algorithm in MATLAB, but I can't get the same result as in the example: the output is noisy. I have tried various ranges for the ratio (e.g. 0.7 to 1.5), but the result is always the same noise. What am I doing wrong? Please help me write it correctly. Thanks in advance!

[y, fs] = wavread('Song.wav');
left = y(:,1);
right = y(:,2);
fftL = fft(left);
fftR = fft(right);

for i = 1:numel(fftL) % 683550 in my example
  dif = fftL(i,1) / fftR(i,1); % left/right ratio for this frequency bin
  dif = abs(dif);
  if (dif > 0.7 && dif < 1.5)
    fftL(i,1) = 0;
    fftR(i,1) = 0;
  end;
end;

leftOut = ifft(fftL);
rightOut = ifft(fftR);
yOut(:,1) = leftOut;
yOut(:,2) = rightOut;

wavwrite(yOut, fs, 'tmp.wav');

Solution

From the code I can see that you classify a frequency bin as vocal content if its magnitude is roughly equal in the left and right channels (roughly equal meaning a left/right ratio between 0.7 and 1.5). I'm not familiar with your reasons for this scheme, but it may actually yield a decent result.

What you are doing wrong most likely has to do with the FFT size: you are processing the complete signal with a single FFT.

Vocals in a song vary over time, so your mask has to vary over time as well. This means you have to break the signal up into frames in the time domain and apply the FFT and the masking to each frame separately. You should also consider overlapping the frames.
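As a rough illustration of the framing described above, here is a minimal MATLAB sketch of frame-by-frame masking with overlap-add. Everything specific in it is an assumption rather than part of the original code: the frame length N = 1024, the 50% overlap, the square-root Hann window, the variable names, and the synthetic two-tone signal that stands in for Song.wav so the script runs on its own. The 0.7 to 1.5 ratio test is taken from the question.

```matlab
% Synthetic stereo stand-in for Song.wav (assumption): a 440 Hz tone in
% both channels (the "vocal") plus a 1000 Hz tone in the left channel only.
fs = 8000;
t = (0:fs-1)' / fs;
centre = 0.5 * sin(2*pi*440*t);
side = 0.5 * sin(2*pi*1000*t);
left = centre + side;
right = centre;

N = 1024;        % frame length (assumed value, tune to taste)
hop = N / 2;     % 50% overlap between consecutive frames
% Square-root periodic Hann window, applied before the FFT and again after
% the inverse FFT; the squared windows of overlapping frames sum to 1.
win = sqrt(0.5 - 0.5 * cos(2*pi*(0:N-1)' / N));

% Zero-pad so that every input sample is covered by exactly two frames.
L = numel(left);
nFrames = ceil(L / hop) + 1;
padLen = (nFrames + 1) * hop;
xl = zeros(padLen, 1); xl(hop + (1:L)) = left;
xr = zeros(padLen, 1); xr(hop + (1:L)) = right;
outL = zeros(padLen, 1);
outR = zeros(padLen, 1);

for k = 1:nFrames
    idx = (k-1)*hop + (1:N);                 % samples of the current frame
    FL = fft(xl(idx) .* win);
    FR = fft(xr(idx) .* win);
    ratio = abs(FL) ./ (abs(FR) + eps);      % eps avoids division by zero
    mask = ratio > 0.7 & ratio < 1.5;        % bins judged "centre" content
    FL(mask) = 0;
    FR(mask) = 0;
    % Back to the time domain, window again, and overlap-add into the output.
    outL(idx) = outL(idx) + real(ifft(FL)) .* win;
    outR(idx) = outR(idx) + real(ifft(FR)) .* win;
end

% Remove the padding and assemble the stereo result.
yOut = [outL(hop + (1:L)), outR(hop + (1:L))];
```

With a real recording you would replace the synthetic left and right with the two columns of the loaded file. Because the mask is recomputed for every frame, it can follow the vocals as they change, and the overlapping windows smooth the transitions between frames.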

Regards

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow