문제

   function [ samples,y, energies] = energy( speech, fs )
   window_ms = 200;
   threshold = 0.75;

   window = window_ms*fs/1000;
   speech = speech(1:(length(speech) - mod(length(speech),window)),1);
   samples = reshape(speech,window,length(speech)/window);
   energies = sqrt(sum(samples.*samples))';

   vuv = energies > threshold;
   y=vuv;

I have this matlab code and I need to write this code in c#. However I couldn't understand the last part of the code. Also i think speech corresponds to a data List or array according to first part of code. If it does not, please can someone explain what this code is doing. I just want to know logic. fs = 1600 or 3200;

도움이 되었습니까?

해결책

The code takes an array representing a signal. It then breaks it into pieces according to a window of specified length, compute the energy in each segment, and finds out which segments have energy above a certain threshold.

Lets go over the code in more details:

speech = speech(1:(length(speech) - mod(length(speech),window)),1);

the above line is basically making sure that the input signal's length is a multiples of the window length. So say speech was an array of 11 values, and window length was 5, then the code would simply keep only the first 10 values (from 1 to 5*2) removing the last remaining one value.

The next line is:

samples = reshape(speech,window,length(speech)/window));

perhaps it is best explained with a quick example:

>> x = 1:20;
>> reshape(x,4,[])
ans =
     1     5     9    13    17
     2     6    10    14    18
     3     7    11    15    19
     4     8    12    16    20

so it reshapes the array into a matrix of "k" rows (k being the window length), and as many columns as needed to complete the array. So the first "K" values would be the first segment, the next "k" values are the second segment, and so on..

Finally the next line is computing the signal energy in each segment (in a vectorized manner).

energies = sqrt(sum(samples.*samples))';

다른 팁

List<int> speech = new List<int>();

int window = 0;

int length = speech.Count();

int result = length % window;

int r = length - result;

// speech = speech(1: r, 1)

This:

(length(speech) - mod(length(speech),window)

is a formula

([length of speech] - [remainder of (speech / window)])

so try

(length(speech) - (length(speech) % window))

% is the symbol equivalent to mod(..)

EDIT I should say that I assume that is what mod(..) is in your code :)

라이센스 : CC-BY-SA ~와 함께 속성
제휴하지 않습니다 StackOverflow
scroll top