Pregunta

I am working on a project that requires me to take an input, perform an DFT (Discrete fourier transform) and then take the number of zero-crossings from these values.

I have coded an algorithm, but, it uses complex numbers and I don't know how to manipulate / perform calculations on them. Here is the code:

#include <iostream>
#include <complex>
#include <vector>

using namespace std;

const double PI = 3.14159265358979323846;

vector< complex<double> > DFT(vector< complex<double> >& theData)
{
    // Define the Size of the read in vector
    const int S = theData.size();

    // Initalise new vector with size of S
    vector< complex<double> > out(S, 0);
    for(unsigned i=0; (i < S); i++)
    {
        out[i] = complex<double>(0.0, 0.0);
        for(unsigned j=0; (j < S); j++)
        {
            out[i] += theData[j] * polar<double>(2, (-2 * PI * i * j / S));
        }
    }

    return out;
}

int main(int argc, char *argv[]) {
    vector< complex<double> > numbers;

    numbers.push_back(128);
    numbers.push_back(127);

    vector< complex<double> > testing = DFT(numbers);

    for(unsigned i=0; (i < testing.size()); i++)
    {
        cout << testing[i] << endl;
    }
}

Now if I wanted to perform for example:

if(testing[i] >= 0)
{
    // blah blah
}

Then it will return an error. Any ideas, or suggestions? Is it possible to create a DFT without using Complex Numbers?

¿Fue útil?

Solución

Whoever gave you your instructions wasn't telling you to count zero crossings on the results of the DFT/FFT. That would be meaningless. (If they were telling you to do that, they were clueless. You have my permission to laugh at them for giving you such ridiculous instructions). Rather they were telling you to count zero crossings on the original data, and also look at the FFT of your data.

However,

  • Zero crossing rate is a pretty crappy starting point for speech recognition. Maybe you can get somewhere with it. With only slight hyperbole, I can say zero crossing is the least robust DSP analysis you can do. However, it is also simple, and speech recognition research has been going on a long time, so maybe there's some research on it. UPDATE/CORRECTION: this is a bit of a hyperbole. Actually I believe a lot of speech recognition techniques DO use zero-crossing, but you should know what you are doing first, because it's not very robust and sensitive to many kinds of errors like octave-errors. When you use zero-crossing, it's a good idea to low-pass (maybe aggressively) first. Definitely consider other factors.

  • Understanding the output of an FFT is something that's asked so often here that I wrote a blog entry. Usually people are trying to track pitch, and you should do that, too actually, but there's other stuff you can get from the FFT like frequency centroid, and the relative strengths of different frequencies that are important in speech. Start here: http://blog.bjornroche.com/2012/07/frequency-detection-using-fft-aka-pitch.html

  • You might also want to consider simply filtering important speech frequencies (to find out what these are, start with wikipedia entry on "manner of articulation". For example, by following the link to Sibilant, you'll learn that "[s] has the most acoustic strength at around 8,000 Hz". Neeto!) You can get that info from an FFT or by filtering. There are advantages and disadvantages to each. You may want to look into the speech recognition literature to see what they use.

Otros consejos

Fourier transformations such as DFT return complex numbers, so you can't really get around them.

Depending on your application, you may be able to safely ignore the imaginary portion of your complex number, and treat the output of your DFT as a sequence of real numbers.

There are plenty of operations you can perform on complex numbers. Some might be relevant to your application, some not. It is worth taking some time to better understand complex numbers.

Finally, no, it is not possible to create a DFT without using complex numbers. You can take the complex output of a DFT and transform it into real numbers, but you will lose information in the process. You need to understand complex numbers and how the DFT is being used in your application to be able to determine whether or not it is appropriate to perform any such transformation.

I had a similar issue, and i gave up using a vector container for the c++ complex double numbers since it isn't well supported with FFT libs and ended up using a plain old array. You will find most of the stuff your trying to do will work just fine.

 std::complex<double>*  in=new std::complex<double> [N];

All arthmatic will work as it would with any other array for instance abs(in[i]) or in[i] *pi just make sure to use the C++ version of the math library

for your specific question you have to check the C++ reference there are real and imag functions you can use to see if it's greater then zero

then just make sure(if your using fftw)

to use reinterpret cast on all the complex numbers (input and output if they are complex)

    p = fftw_plan_dft_c2r_1d(N, reinterpret_cast<fftw_complex*>(in), out,FFTW_ESTIMATE);  

   fftw_execute(p);

A DFT will always use complex numbers, at least for its output. If the input describes some signal over time, then the output describes the signal according to frequencies. Each complex number may be written in polar form and then split into an absolute value which denotes amplitude and an angle which denotes phase. Perhaps it's the amplitudes you're interested in; if so, you'll want to compute abolute values, but they'll all be non-negative as well.

There are variations of a DFT which work on real numbers. The discrete cosine transformation comes to my mind in that respect. Not sure whether this is of any use in your application.

Note that there are libraries like FFTW which may compute the DFT faster than your code would. Even a self-written FFT might be worth considering, as long as your input size is a power of two. But all of this is a bit beside the point of your actual question.

Licenciado bajo: CC-BY-SA con atribución
No afiliado a StackOverflow
scroll top