Question

I'm writing a C++ library for decoding and encoding audio between different formats/codecs. I have a routine for quickly detecting the format before loading the required codec library.

For WAV files one can simply look for the ASCII strings "RIFF" and "WAVE" at the start of the file (bytes 0-3 and 8-11). The same applies to FLAC: the first 4 bytes of the file are "fLaC".
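For illustration, these magic-number checks could be sketched roughly as below. The names `Magic` and `detectByMagic` are made up for this example, and the buffer is assumed to hold the first bytes of the file.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

enum class Magic { Unknown, Wav, Flac };

// Inspect the first bytes of a file that has already been read into memory.
Magic detectByMagic(const std::uint8_t* data, std::size_t size) {
    // WAV: "RIFF" at offset 0, a 4-byte chunk size, then "WAVE" at offset 8.
    if (size >= 12 && std::memcmp(data, "RIFF", 4) == 0 &&
        std::memcmp(data + 8, "WAVE", 4) == 0)
        return Magic::Wav;
    // FLAC: the stream marker "fLaC" at offset 0.
    if (size >= 4 && std::memcmp(data, "fLaC", 4) == 0)
        return Magic::Flac;
    return Magic::Unknown;
}
```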

But how can I quickly detect if a file is MP3? I can't rely on the file extension. I also can't just try to decode the first MP3 frame, since there might be additional data at the start of the file (e.g. ID3 tags, a cover image, etc.).

Solution

Detecting if a file is an MP3 is more complicated than searching for a fixed pattern in the file.

Some concepts

(See http://www.codeproject.com/Articles/8295/MPEG-Audio-Frame-Header for details)

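  • An MP3 file consists of a series of frames, and each frame begins with a 4-byte header.
  • The header starts on a byte boundary with an 11-bit sync word of all 1s: the first byte is 0xFF and the top three bits of the second byte are set. Read as 12 bits, the start of the header is therefore either 0xFFE or 0xFFF.
  • The length of each frame is calculated from the header parameters (MPEG version, layer, bitrate, sampling rate and padding bit).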
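
As a rough sketch of these concepts, the function below checks the sync word and computes the frame length for a Layer III header. The name `mp3FrameLength` is made up for this example, and Layers I and II as well as "free format" bitrates are deliberately rejected to keep it short.

```cpp
#include <cstddef>
#include <cstdint>

// Returns the frame length in bytes for an MPEG audio Layer III header,
// or 0 if the four bytes are not a plausible Layer III frame header.
std::size_t mp3FrameLength(const std::uint8_t h[4]) {
    // 11-bit sync word: all of byte 0 plus the top three bits of byte 1.
    if (h[0] != 0xFF || (h[1] & 0xE0) != 0xE0) return 0;

    const int version = (h[1] >> 3) & 0x3;  // 0 = MPEG 2.5, 1 = reserved, 2 = MPEG 2, 3 = MPEG 1
    const int layer   = (h[1] >> 1) & 0x3;  // 1 = Layer III
    const int brIndex = (h[2] >> 4) & 0xF;  // 0 = free format, 15 = invalid
    const int srIndex = (h[2] >> 2) & 0x3;  // 3 = reserved
    const int padding = (h[2] >> 1) & 0x1;

    if (version == 1 || layer != 1 || brIndex == 0 || brIndex == 15 || srIndex == 3)
        return 0;

    // Layer III bitrates in kbit/s, indexed by the 4-bit bitrate index.
    static const int kbpsV1[15] = {0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320};
    static const int kbpsV2[15] = {0, 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160};
    static const int rateV1[3]  = {44100, 48000, 32000};

    const bool v1 = (version == 3);
    const int bitrate = (v1 ? kbpsV1 : kbpsV2)[brIndex] * 1000;
    int sampleRate = rateV1[srIndex];
    if (version == 2) sampleRate /= 2;  // MPEG 2:   22050, 24000, 16000
    if (version == 0) sampleRate /= 4;  // MPEG 2.5: 11025, 12000, 8000

    // Layer III has 1152 samples per frame in MPEG 1 and 576 in MPEG 2/2.5,
    // which gives the factors 144 and 72 (samples per frame / 8).
    return (v1 ? 144 : 72) * bitrate / sampleRate + padding;
}
```

For example, the common header bytes FF FB 90 00 (MPEG 1, Layer III, 128 kbit/s, 44100 Hz, no padding) give 144 * 128000 / 44100 = 417 bytes.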

Algorithm to determine if a file is MP3 or not

  • Search the file for the sync word (a byte-aligned position whose first 12 bits are 0xFFF or 0xFFE).
  • Parse the header parameters.
  • Determine the frame length using the header parameters.
  • Seek to the next frame using the frame length.
  • If you find another sync word after seeking, then the file is most likely an MP3 file.
  • To be sure, repeat the process until you have found N consecutive MP3 frames. A larger N reduces the chance of a false positive.
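
The steps above could be sketched as follows. This is only an outline under some assumptions: the file (or its first chunk) is already in memory, `looksLikeMp3` and `id3v2Size` are names invented for this example, and `frameLength` is a caller-supplied function that returns the frame length in bytes for the four header bytes it is given, or 0 if they do not form a valid header. Skipping a leading ID3v2 tag is an addition to the listed steps; it avoids scanning through tag data such as embedded cover art.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Size of a leading ID3v2 tag (header included), or 0 if there is none.
std::size_t id3v2Size(const std::vector<std::uint8_t>& d) {
    if (d.size() < 10 || d[0] != 'I' || d[1] != 'D' || d[2] != '3') return 0;
    // The tag size is a "synchsafe" integer: only 7 bits of each byte are used.
    const std::size_t body = (std::size_t(d[6] & 0x7F) << 21) |
                             (std::size_t(d[7] & 0x7F) << 14) |
                             (std::size_t(d[8] & 0x7F) << 7) |
                              std::size_t(d[9] & 0x7F);
    const std::size_t footer = (d[5] & 0x10) ? 10 : 0;  // optional ID3v2.4 footer
    return 10 + body + footer;
}

// Returns true if `requiredFrames` frame headers are found back to back,
// each one located where the previous frame's length says it should be.
template <class FrameLength>
bool looksLikeMp3(const std::vector<std::uint8_t>& d, FrameLength frameLength,
                  int requiredFrames = 4, std::size_t maxScan = 64 * 1024) {
    const std::size_t begin = id3v2Size(d);
    // Try every byte offset in the scan window as a candidate first frame.
    for (std::size_t start = begin;
         start + 4 <= d.size() && start - begin < maxScan; ++start) {
        std::size_t pos = start;
        int found = 0;
        while (found < requiredFrames && pos + 4 <= d.size()) {
            const std::size_t len = frameLength(&d[pos]);
            if (len < 4) break;  // not a valid header at this position
            pos += len;          // seek to where the next frame should start
            ++found;
        }
        if (found == requiredFrames) return true;
    }
    return false;
}
```

The `maxScan` limit keeps the check fast on files that are not MP3 at all; both it and the default of 4 frames are arbitrary values chosen for the sketch.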
Licensed under: CC-BY-SA with attribution