What format should the data be in for vDSP_ctoz in the iOS Accelerate framework?

https://stackoverflow.com/questions/22585341

19-06-2023

Question

I am trying to display a spectrum analyser for iOS and am stuck after two weeks. I have read pretty much every post about FFT and the Accelerate Frameworks on here and have downloaded the aurioTouch2 example from Apple.

I think I understand the mechanism of FFT (did it in Uni 20 years ago) and am a fairly experienced iOS programmer but I have hit a wall.

I am using AudioUnit to play mp3, m4a, and wav files and have that working beautifully. I have attached a Render Callback to the AUGraph and I can plot Waveforms to the music. The waveform goes with the music nicely.

Waveform image

When I take the data from the Render Callback, which is in Float form in the range 0 .. 1, and attempt to pass it through the FFT code (either my own or aurioTouch2's FFTBufferManager.mm), I get something that's not completely wrong but is not correct either. For instance, this is a 440Hz sine wave:

440Hz sine wave

The peak value is -6.1306, followed by roughly -24, -31 and -35, and the values towards the end are around -63.

Animated gif for "Black Betty":


The format I receive from the Render callback:

    AudioStreamBasicDescription outputFileFormat;
    outputFileFormat.mSampleRate = 44100;
    outputFileFormat.mFormatID = kAudioFormatLinearPCM;
    outputFileFormat.mFormatFlags = kAudioFormatFlagsNativeFloatPacked | kAudioFormatFlagIsNonInterleaved;
    outputFileFormat.mBitsPerChannel = 32;
    outputFileFormat.mChannelsPerFrame = 2;
    outputFileFormat.mFramesPerPacket = 1;
    outputFileFormat.mBytesPerFrame = outputFileFormat.mBitsPerChannel / 8;
    outputFileFormat.mBytesPerPacket = outputFileFormat.mBytesPerFrame;

In looking at the aurioTouch2 example it looks like they are receiving their data in a signed int format but then running an AudioConverter to convert it to Float. Their format is hard to decipher but is using a macro:

    drawFormat.SetAUCanonical(2, false);
    drawFormat.mSampleRate = 44100;

    XThrowIfError(AudioConverterNew(&thruFormat, &drawFormat, &audioConverter), "couldn't setup AudioConverter");

In their render callback they copy the data out of the AudioBufferList into mAudioBuffer (Float32*) and pass it to the CalculateFFT method, which calls vDSP_ctoz:

    //Generate a split complex vector from the real data
    vDSP_ctoz((COMPLEX *)mAudioBuffer, 2, &mDspSplitComplex, 1, mFFTLength);

I think this is where my problem is. What format does vDSP_ctoz expect? The buffer is cast to (COMPLEX*), but I cannot find anywhere in the aurioTouch2 code that puts the mAudioBuffer data into the (COMPLEX*) format. So it must be coming from the Render Callback in this format?

    typedef struct DSPComplex {
        float  real;
        float  imag;
    } DSPComplex;
    typedef DSPComplex                      COMPLEX;
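
For reference, here is a minimal, portable sketch of what the (COMPLEX *) cast amounts to when the buffer holds plain real samples and is read with a stride of 2: even-indexed samples end up in the real array and odd-indexed samples in the imaginary array, so N real samples become N/2 pairs. As far as I understand the vDSP documentation, this even/odd packing is what the real FFT (vDSP_fft_zrip) takes as input. The sketch does not call Accelerate, and SplitComplex and packRealSignal are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Accelerate's split-complex layout:
// two separate arrays instead of interleaved (real, imag) pairs.
struct SplitComplex {
    std::vector<float> realp;
    std::vector<float> imagp;
};

// Treat consecutive real samples as (real, imag) pairs and de-interleave them.
// This mimics reading a float buffer through a (COMPLEX *) cast with an
// input stride of 2 and an output stride of 1.
SplitComplex packRealSignal(const std::vector<float>& samples) {
    assert(samples.size() % 2 == 0);           // need whole pairs
    const std::size_t pairCount = samples.size() / 2;

    SplitComplex out{std::vector<float>(pairCount),
                     std::vector<float>(pairCount)};
    for (std::size_t i = 0; i < pairCount; ++i) {
        out.realp[i] = samples[2 * i];         // even-indexed sample
        out.imagp[i] = samples[2 * i + 1];     // odd-indexed sample
    }
    return out;
}
```

With this packing, the element count handed to the de-interleave step is half the number of real samples, not the full sample count.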

If I don't have the format correct at this point (or understand the format) then there is no point in debugging the rest of it.

Any help would be greatly appreciated.

Code from AurioTouch2 that I am using:

    Boolean FFTBufferManager::ComputeFFTFloat(Float32 *outFFTData)
    {
        if (HasNewAudioData())
        {
            // Added after Hotpaw2 comment.
            UInt32 windowSize = mFFTLength;
            Float32 *window = (float *) malloc(windowSize * sizeof(float));

            memset(window, 0, windowSize * sizeof(float));

            vDSP_hann_window(window, windowSize, 0);

            vDSP_vmul( mAudioBuffer, 1, window, 1, mAudioBuffer, 1, mFFTLength);

            // Added after Hotpaw2 comment.
            DSPComplex *audioBufferComplex = new DSPComplex[mFFTLength];

            for (int i=0; i < mFFTLength; i++)
            {
                audioBufferComplex[i].real = mAudioBuffer[i];
                audioBufferComplex[i].imag = 0.0f;
            }

            //Generate a split complex vector from the real data
            vDSP_ctoz((COMPLEX *)audioBufferComplex, 2, &mDspSplitComplex, 1, mFFTLength);

            //Take the fft and scale appropriately
            vDSP_fft_zrip(mSpectrumAnalysis, &mDspSplitComplex, 1, mLog2N, kFFTDirection_Forward);
            vDSP_vsmul(mDspSplitComplex.realp, 1, &mFFTNormFactor, mDspSplitComplex.realp, 1, mFFTLength);
            vDSP_vsmul(mDspSplitComplex.imagp, 1, &mFFTNormFactor, mDspSplitComplex.imagp, 1, mFFTLength);

            //Zero out the nyquist value
            mDspSplitComplex.imagp[0] = 0.0;

            //Convert the fft data to dB
            vDSP_zvmags(&mDspSplitComplex, 1, outFFTData, 1, mFFTLength);

            //In order to avoid taking log10 of zero, an adjusting factor is added in to make the minimum value equal -128dB
            vDSP_vsadd( outFFTData, 1, &mAdjust0DB, outFFTData, 1, mFFTLength);
            Float32 one = 1;
            vDSP_vdbcon(outFFTData, 1, &one, outFFTData, 1, mFFTLength, 0);

            // audioBufferComplex came from new[], so it must be released with delete[] (was free)
            delete[] audioBufferComplex;
            free( window);

            OSAtomicDecrement32Barrier(&mHasAudioData);
            OSAtomicIncrement32Barrier(&mNeedsAudioData);
            mAudioBufferCurrentIndex = 0;
            return true;
        }
        else if (mNeedsAudioData == 0)
            OSAtomicIncrement32Barrier(&mNeedsAudioData);

        return false;
    }

After reading the answer below I tried adding this to the top of the method:

    DSPComplex *audioBufferComplex = new DSPComplex[mFFTLength];

    for (int i=0; i < mFFTLength; i++)
    {
        audioBufferComplex[i].real = mAudioBuffer[i];
        audioBufferComplex[i].imag = 0.0f;
    }

    //Generate a split complex vector from the real data
    vDSP_ctoz((COMPLEX *)audioBufferComplex, 2, &mDspSplitComplex, 1, mFFTLength);

And the result I got was this:

After Adding above code

I am now rendering the last 5 results; the older ones are the faded traces behind.

After adding hann window:

enter image description here

It now looks a lot better after applying the Hann window (thanks, hotpaw2). I'm not worried about the mirror image.
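
For readers without the Accelerate headers to hand, the windowing step can be sketched in portable C++. This is only an illustration using the textbook periodic Hann formula w[i] = 0.5 * (1 - cos(2*pi*i/N)); that it matches the exact values vDSP_hann_window produces with a flag of 0 is an assumption, and makeHannWindow and applyWindow are hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Build an N-point periodic Hann window: w[i] = 0.5 * (1 - cos(2*pi*i/N)).
std::vector<float> makeHannWindow(std::size_t n) {
    assert(n > 0);
    const double twoPi = 2.0 * std::acos(-1.0);
    std::vector<float> window(n);
    for (std::size_t i = 0; i < n; ++i) {
        window[i] = static_cast<float>(0.5 * (1.0 - std::cos(twoPi * i / n)));
    }
    return window;
}

// Multiply the samples by the window in place (the role vDSP_vmul plays
// in the code above).
void applyWindow(std::vector<float>& samples, const std::vector<float>& window) {
    assert(samples.size() == window.size());
    for (std::size_t i = 0; i < samples.size(); ++i) {
        samples[i] *= window[i];
    }
}
```

The window tapers both ends of the block towards zero, which reduces the leakage that otherwise smears a single tone across neighbouring bins.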

My main problem now is that with a real song it doesn't look like other spectrum analysers. Everything is always pushed high on the left, no matter what music I push through it. After applying the window it does seem to follow the beat a lot better, though.

Black Betty

Answer

The AU render callback only returns the real part of the complex input that a complex FFT requires. To use a complex FFT, you need to supply an equal number of imaginary components yourself, filled with zeros, and copy the real samples into the real part if they are not already there.
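
A minimal sketch of that idea in portable C++, assuming the samples are already available as a vector of floats. It does not use Accelerate: toComplexInput and magnitudeSpectrum are hypothetical helpers, and the naive O(N^2) DFT merely stands in for a real complex FFT routine so the effect of the zero imaginary parts can be checked.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Pair every real sample with a zero imaginary component.
std::vector<std::complex<float>> toComplexInput(const std::vector<float>& samples) {
    std::vector<std::complex<float>> out;
    out.reserve(samples.size());
    for (float s : samples) {
        out.emplace_back(s, 0.0f);
    }
    return out;
}

// Naive DFT magnitude spectrum; a slow placeholder for a complex FFT.
std::vector<float> magnitudeSpectrum(const std::vector<std::complex<float>>& x) {
    const std::size_t n = x.size();
    assert(n > 0);
    const double twoPi = 2.0 * std::acos(-1.0);
    std::vector<float> mags(n);
    for (std::size_t k = 0; k < n; ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = -twoPi * static_cast<double>(k * t) / static_cast<double>(n);
            sum += std::complex<double>(x[t].real(), x[t].imag()) * std::polar(1.0, angle);
        }
        mags[k] = static_cast<float>(std::abs(sum));
    }
    return mags;
}
```

Because the imaginary inputs are all zero, the upper half of the output mirrors the lower half, so only the first N/2 bins carry independent information.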

Licensed under: CC-BY-SA with attribution

Not affiliated with Stack Overflow