Question

So, I'm trying to do a simple calculation over previously recorded audio (from an AVAsset) in order to create a waveform visual. I currently do this by averaging a set of samples, the size of which is determined by dividing the audio file size by the resolution I want for the waveform.

This all works fine, except for one problem: it's too slow. Running on a 3GS, processing an audio file takes about 3% of the time it takes to play it, which is way too slow (for example, a 1-hour audio file takes about 2.5 minutes to process). I've tried to optimize the method as much as possible, but it hasn't helped. I'll post the code I use to process the file; maybe someone will be able to help with that, but what I'm really looking for is a way to process the file without having to go over every single byte. So, given a resolution of 2,000, I'd want to access the file and take a sample at each of the 2,000 points. I think this would be a lot quicker, especially for larger files. But the only way I know to get the raw data is to access the audio file in a linear manner. Any ideas? (Note: all class vars begin with '_'.)
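Roughly, the averaging I'm describing boils down to something like this (just an illustrative sketch, not my actual code; it assumes the 16-bit mono PCM samples are already sitting in memory, and the names are made up):

//Illustrative only: one waveform point per bucket, where each point is the mean of
//the absolute sample values in that bucket. bucketSize = totalSamples / resolution.
static void downsampleForWaveform(const SInt16 *samples, NSUInteger totalSamples,
                                  NSUInteger resolution, SInt16 *outPoints) {
    NSUInteger bucketSize = MAX((NSUInteger)1, totalSamples / resolution);
    for (NSUInteger point = 0; point < resolution; point++) {
        NSUInteger start = point * bucketSize;
        NSUInteger end = MIN(start + bucketSize, totalSamples);
        SInt64 total = 0;
        for (NSUInteger s = start; s < end; s++)
            total += abs(samples[s]);
        outPoints[point] = (end > start) ? (SInt16)(total / (end - start)) : 0;
    }
}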

So I've completely changed this question. I belatedly realized that AVAssetReader has a timeRange property that's used for "seeking", which is exactly what I was looking for (see the original question above). Furthermore, the question has already been asked and answered (I just didn't find it before) and I don't want to duplicate questions. However, I'm still having a problem. My app freezes for a while and then eventually crashes whenever I call copyNextSampleBuffer. I'm not sure what's going on. I don't seem to be in any kind of recursion loop; it just never returns from the function call. Checking the logs gives me this error:

Exception Type:  00000020
Exception Codes: 0x8badf00d
Highlighted Thread:  0

Application Specific Information:
App[10570] has active assertions beyond permitted time: 
{(
    <SBProcessAssertion: 0xddd9300> identifier: Suspending process: App[10570] permittedBackgroundDuration: 10.000000 reason: suspend owner pid:52 preventSuspend  preventThrottleDownCPU  preventThrottleDownUI 
)}

I ran a time profiler on the app and, yep, it just sits there doing a minimal amount of processing. I can't quite figure out what's going on. (As far as I can tell, 0x8badf00d is the iOS watchdog's timeout code, so the system seems to be killing the app because it's stuck and can't be suspended in time.) It's important to note that this doesn't occur if I don't set the timeRange property of the AVAssetReader. I've checked and the values for timeRange are valid, but setting it is causing the problem for some reason. Here's my processing code:

- (void) processSampleData{
    if (!_asset || CMTimeGetSeconds(_asset.duration) <= 0) return;
    NSError *error = nil;
    AVAssetTrack *songTrack = _asset.tracks.firstObject;
    if (!songTrack) return;
    NSDictionary *outputSettingsDict = [[NSDictionary alloc] initWithObjectsAndKeys:
                                        [NSNumber numberWithInt:kAudioFormatLinearPCM],AVFormatIDKey,
                                        [NSNumber numberWithInt:16], AVLinearPCMBitDepthKey,
                                        [NSNumber numberWithBool:NO],AVLinearPCMIsBigEndianKey,
                                        [NSNumber numberWithBool:NO],AVLinearPCMIsFloatKey,
                                        [NSNumber numberWithBool:NO],AVLinearPCMIsNonInterleaved,
                                        nil];

    UInt32 sampleRate = 44100; //default, overwritten below from the track's format description
    _channelCount = 1;

    NSArray *formatDesc = songTrack.formatDescriptions;
    for(unsigned int i = 0; i < [formatDesc count]; ++i) {
        CMAudioFormatDescriptionRef item = (__bridge_retained CMAudioFormatDescriptionRef)[formatDesc objectAtIndex:i];
        const AudioStreamBasicDescription* fmtDesc = CMAudioFormatDescriptionGetStreamBasicDescription (item);
        if(fmtDesc ) { 
            sampleRate = fmtDesc->mSampleRate;
            _channelCount = fmtDesc->mChannelsPerFrame;
        }
        CFRelease(item);
    }

    UInt32 bytesPerSample = 2 * _channelCount; //2 bytes per channel, fixed by the 16-bit AVLinearPCMBitDepthKey above
    _normalizedMax = 0;
    _sampledData = [[NSMutableData alloc] init];

    SInt16 *channels[_channelCount];
    char *sampleRef;
    SInt16 *samples;
    NSInteger sampleTally = 0;
    SInt16 cTotal;
    _sampleCount = DefaultSampleSize * [UIScreen mainScreen].scale;
    NSTimeInterval intervalBetweenSamples = _asset.duration.value / _sampleCount;
    NSTimeInterval sampleSize = fmax(100, intervalBetweenSamples / _sampleCount);
    double assetTimeScale = _asset.duration.timescale;
    CMTimeRange timeRange = CMTimeRangeMake(CMTimeMake(0, assetTimeScale), CMTimeMake(sampleSize, assetTimeScale));

    SInt64 totals[_channelCount]; //64-bit accumulators so summing many SInt16 samples can't overflow
    memset(totals, 0, sizeof(totals));
    @autoreleasepool {
        for (int i = 0; i < _sampleCount; i++) {
            AVAssetReader *reader = [AVAssetReader assetReaderWithAsset:_asset error:&error];
            AVAssetReaderTrackOutput *trackOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:songTrack outputSettings:outputSettingsDict];
            [reader addOutput:trackOutput];
            reader.timeRange = timeRange;
            [reader startReading];
            while (reader.status == AVAssetReaderStatusReading) {
                CMSampleBufferRef sampleBufferRef = [trackOutput copyNextSampleBuffer];
                if (sampleBufferRef){
                    CMBlockBufferRef blockBufferRef = CMSampleBufferGetDataBuffer(sampleBufferRef);
                    size_t length = CMBlockBufferGetDataLength(blockBufferRef);
                    int frameCount = length / bytesPerSample;
                    for (int frame = 0; frame < frameCount; frame += _channelCount) { //renamed so it doesn't shadow the outer loop's i
                        //Read a whole frame (all channels, 2 bytes each); the length must be bytesPerSample, not _channelCount
                        CMBlockBufferAccessDataBytes(blockBufferRef, frame * bytesPerSample, bytesPerSample, channels, &sampleRef);
                        samples = (SInt16 *)sampleRef;
                        for (int channel = 0; channel < _channelCount; channel++)
                            totals[channel] += samples[channel];
                        sampleTally++;
                    }
                    CMSampleBufferInvalidate(sampleBufferRef);
                    CFRelease(sampleBufferRef);
                }
            }
            for (int channel = 0; channel < _channelCount; channel++){
                //Average for this slice (guard against an empty slice; llabs because totals is now 64-bit)
                cTotal = sampleTally ? (SInt16)llabs(totals[channel] / sampleTally) : 0;
                if (cTotal > _normalizedMax) _normalizedMax = cTotal;
                [_sampledData appendBytes:&cTotal length:sizeof(cTotal)];
                totals[channel] = 0;
            }
            sampleTally = 0;
            timeRange.start = CMTimeMake((intervalBetweenSamples * (i + 1)) - sampleSize, assetTimeScale); //Take the sample just before the interval
        }

    }
    _assetNeedsProcessing = NO;
}

Solution

I finally figured out why. Apparently there is some sort of 'minimum' duration you can specify for the timeRange of an AVAssetReader. I'm not sure exactly what that minimum is, but it's somewhere above 1,000 and less than 5,000. It's possible that the minimum changes with the duration of the asset... honestly, I'm not sure. Instead, I kept the duration (which is infinity) the same and simply changed the start time. Rather than processing the whole sample range, I copy only one buffer block, process that, and then seek to the next time. I'm still having trouble with the code, but I'll post that as another question if I can't figure it out.
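A minimal sketch of that approach, reusing the variable names from the code above, might look something like this (this isn't my final code: the duration is left at kCMTimePositiveInfinity, only the start time moves, and error handling is omitted):

for (int i = 0; i < _sampleCount; i++) {
    AVAssetReader *reader = [AVAssetReader assetReaderWithAsset:_asset error:&error];
    AVAssetReaderTrackOutput *trackOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:songTrack outputSettings:outputSettingsDict];
    [reader addOutput:trackOutput];

    //Seek by changing only the start time; leave the duration open-ended.
    reader.timeRange = CMTimeRangeMake(CMTimeMake(intervalBetweenSamples * i, assetTimeScale), kCMTimePositiveInfinity);
    [reader startReading];

    //Copy a single buffer at this position, process it, then move on to the next point.
    CMSampleBufferRef sampleBufferRef = [trackOutput copyNextSampleBuffer];
    if (sampleBufferRef) {
        //...average the samples in this one buffer exactly as in the inner loop above...
        CMSampleBufferInvalidate(sampleBufferRef);
        CFRelease(sampleBufferRef);
    }
    [reader cancelReading];
}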

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow