Question

I am trying to access the raw data for an audio file on the iPhone/iPad. I have the following code, which is a basic start down the path I need. However, I am stumped at what to do once I have an AudioBuffer.

AVAssetReader *assetReader = [AVAssetReader assetReaderWithAsset:urlAsset error:nil];
AVAssetReaderTrackOutput *assetReaderOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:[[urlAsset tracks] objectAtIndex:0] outputSettings:nil];
[assetReader addOutput:assetReaderOutput];
[assetReader startReading];

CMSampleBufferRef ref;
NSArray *outputs = assetReader.outputs;
AVAssetReaderOutput *output = [outputs objectAtIndex:0];
int y = 0;
while (ref = [output copyNextSampleBuffer]) {
    AudioBufferList audioBufferList;
    CMBlockBufferRef blockBuffer;
    CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(ref, NULL, &audioBufferList, sizeof(audioBufferList), NULL, NULL, 0, &blockBuffer);
    for (y=0; y<audioBufferList.mNumberBuffers; y++) {
        AudioBuffer audioBuffer = audioBufferList.mBuffers[y];
        SInt16 *frames = audioBuffer.mData;
        for(int i = 0; i < 24000; i++) { // This sometimes crashes
            Float32 currentFrame = frames[i] / 32768.0f;
        }
    }
}

Essentially, I don't know how to tell how many frames each buffer contains, so I can't reliably extract the data from them. I am new to working with raw audio data, so I'm open to any suggestions on how best to read the mData property of the AudioBuffer struct. I also haven't done much with void pointers in the past, so help with that in this context would be great too!


Solution

audioBuffer.mDataByteSize tells you the size of the buffer in bytes. If you didn't know this, you can't have looked at the declaration of struct AudioBuffer. You should always look at the header files as well as the docs.

For mDataByteSize to make sense you must know the format of the data. The number of values in the buffer is mDataByteSize / sizeof(outputType). However, you seem confused about the format. It is determined by the outputSettings you pass to the reader output or, when those are nil as in your code, by the format stored in the track. First of all you treat it as a 16-bit signed int:

SInt16 *frames = audioBuffer.mData;

then you treat it as a 32-bit float:

Float32 currentFrame = frames[i] / 32768.0f;

In between, you assume that there are 24000 values. Of course this will crash if the buffer holds fewer than 24000 16-bit values. Also, you refer to the data as 'frames' but what you really mean is samples. Each value you call 'currentFrame' is one sample of the audio. In Core Audio terms a 'frame' is the set of samples, one per channel, for a single point in time, so the two only coincide for mono data.
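To make the sample and frame counts concrete, here is a minimal sketch in plain C. Core Audio's headers use 'frame' for one sample per channel at a single instant, so an interleaved stereo buffer holds half as many frames as samples. DemoAudioBuffer and both helper functions are hypothetical names introduced for illustration; the struct only mirrors the fields of the real AudioBuffer so that the sketch compiles without the Apple headers.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in that mirrors the fields of Core Audio's AudioBuffer. */
typedef struct {
    uint32_t mNumberChannels; /* interleaved channels in this buffer */
    uint32_t mDataByteSize;   /* size of mData in bytes */
    void    *mData;           /* raw sample data */
} DemoAudioBuffer;

/* Number of individual sample values held in the buffer. */
static uint32_t demo_sample_count(const DemoAudioBuffer *buf, size_t bytesPerSample) {
    if (bytesPerSample == 0) return 0;
    return (uint32_t)(buf->mDataByteSize / bytesPerSample);
}

/* Number of frames, where one frame holds one sample per channel. */
static uint32_t demo_frame_count(const DemoAudioBuffer *buf, size_t bytesPerSample) {
    if (buf->mNumberChannels == 0) return 0;
    return demo_sample_count(buf, bytesPerSample) / buf->mNumberChannels;
}
```

With a real AudioBuffer the same arithmetic applies to audioBuffer.mDataByteSize and audioBuffer.mNumberChannels, provided you know the sample size from the format you requested.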

So, assuming the data format is 32-bit float (and please note, I have no idea if it is; it could be 8-bit int or 32-bit fixed for all I know):

for (UInt32 y = 0; y < audioBufferList.mNumberBuffers; y++)
{
  AudioBuffer audioBuffer = audioBufferList.mBuffers[y];
  // Number of Float32 values that actually fit in this buffer
  UInt32 sampleCount = audioBuffer.mDataByteSize / sizeof(Float32);
  Float32 *samples = (Float32 *)audioBuffer.mData;
  for (UInt32 i = 0; i < sampleCount; i++) {
    Float32 currentSample = samples[i];
    // process currentSample here
  }
}

Note, sizeof(Float32) is always 4, but I left it in to be clear.
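If the data turns out to be 16-bit signed integer linear PCM instead, which is what the cast in the question implies, the same bounds check applies and dividing by 32768 is a common way to normalise to floats. The following is a minimal sketch in plain C under that assumption. DemoAudioBuffer and demo_int16_to_float are hypothetical names, and the struct is only a stand-in for the real AudioBuffer. Note that with outputSettings:nil the reader vends samples in the track's stored format, which may be compressed, so this only makes sense once linear PCM has actually been requested.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in that mirrors the fields of Core Audio's AudioBuffer. */
typedef struct {
    uint32_t mNumberChannels; /* interleaved channels in this buffer */
    uint32_t mDataByteSize;   /* size of mData in bytes */
    void    *mData;           /* raw sample data, assumed 16-bit signed PCM */
} DemoAudioBuffer;

/* Converts the buffer's 16-bit samples to floats in [-1.0, 1.0).
   Never reads past mDataByteSize and never writes past outCapacity.
   Returns the number of samples written to out. */
static size_t demo_int16_to_float(const DemoAudioBuffer *buf,
                                  float *out, size_t outCapacity) {
    const int16_t *samples = (const int16_t *)buf->mData;
    size_t count = buf->mDataByteSize / sizeof(int16_t);
    if (count > outCapacity) {
        count = outCapacity;
    }
    for (size_t i = 0; i < count; i++) {
        out[i] = samples[i] / 32768.0f;
    }
    return count;
}
```

The loop bound comes from mDataByteSize, not from a fixed constant such as 24000, so short buffers at the end of the file are handled without reading out of bounds.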

Licensed under: CC-BY-SA with attribution