Question

Based on the http://id3.org/id3v2.3.0 specification, the layout of the frame header is:

Frame ID       $xx xx xx xx (four characters)
Size           $xx xx xx xx
Flags          $xx xx

But the same page, just a couple of lines below, says that frames that allow different types of text encoding have a text encoding description byte directly after the frame size. If ISO-8859-1 is used this byte should be $00, if Unicode is used it should be $01.

This is confusing, because the flags (2 bytes) should come directly after the frame size, so I would expect the encoding byte to come after the flags.

So which is correct?

Frame ID       $xx xx xx xx (four characters)
Size           $xx xx xx xx
Flags          $xx xx
Encoding       $xx
Text

or

Frame ID       $xx xx xx xx (four characters)
Size           $xx xx xx xx
Encoding       $xx
Flags          $xx xx
Text

Solution

I think this might actually be a case of bad wording in the spec. I found two diagrams in the ID3v2 Chapter Frame Addendum showing examples of complete headers. That document describes two newly introduced frame types, which are not relevant to the question at hand. Fortunately, it also contains examples of an embedded 'Title/Songname/Content description' frame (TIT2) and a 'Subtitle/Description refinement' frame (TIT3), which are both text frames*:

[Image: diagram from the ID3v2 Chapter Frame Addendum showing embedded TIT2 and TIT3 frames]

According to the diagram, the Title frame (ID: TIT2) has the following structure. First comes the frame header:

Frame ID       $xx xx xx xx (four characters)
Size           $xx xx xx xx
Flags          $xx xx

which is then directly followed by the ID-dependent fields:

Text encoding  $xx
Information    <text string according to encoding>

This layout makes the most sense to me. If you still have doubts about the correct layout, you could check out the source of one of the existing implementations.
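
To make the layout above concrete, here is a minimal Python sketch that reads one text frame laid out as header first, then the encoding byte. The function name `parse_text_frame` and the hand-built sample frame are only illustrations, not taken from any existing library, and the sketch assumes an ID3v2.3 frame whose size field is a plain 32-bit big-endian integer.

```python
import struct

def parse_text_frame(data: bytes):
    """Parse one ID3v2.3 text frame from the start of `data` (hypothetical helper)."""
    # 10-byte header: 4-byte frame ID, 4-byte big-endian size, 2-byte flags.
    frame_id, size, flags = struct.unpack(">4sIH", data[:10])
    # The size counts only the bytes after the header.
    body = data[10:10 + size]
    # The encoding byte is the first byte of the body, i.e. after the flags.
    encoding = body[0]
    text = body[1:]
    return frame_id.decode("ascii"), size, flags, encoding, text

# Hand-built TIT2 frame: size 6 = 1 encoding byte + 5 text bytes, flags 0.
frame = b"TIT2" + struct.pack(">IH", 6, 0) + b"\x00" + b"Hello"
print(parse_text_frame(frame))
```

Running it on the sample frame gives the ID `TIT2`, size 6, flags 0, encoding 0 and the raw text bytes `b'Hello'`, which matches the "header, then encoding, then text" reading.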

Sidenote: in the ID3v2.4.0 specification they changed the confusing sentence to:

Frames that allow different types of text encoding contains a text encoding description byte.

* Only frames that allow different types of text encoding have a text encoding description byte.
Unsurprisingly, most of these are text frames

OTHER TIPS

A frame header is 10 bytes long: 4 bytes for the frame ID, 4 bytes for the length of the frame (header excluded), and 2 bytes for flags. Any other info will be found in the frame itself, not in its header.

The wording sure is confusing.

What is meant is that wherever you expect to read a string, the first byte tells you what to expect. $00 means ISO-8859-1 (one byte per character); $01 means Unicode (two bytes per character). With $01, the string starts with a byte order mark, either FF FE (little-endian) or FE FF (big-endian), which tells you which byte is the most significant one.
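
As a rough illustration of that rule, the sketch below decodes the body of a text frame (everything after the 10-byte header). The name `decode_id3_text` is made up for this example. It assumes only the two ID3v2.3 encodings, and it uses Python's `utf-16` codec as a stand-in for the spec's Unicode, since that codec reads the byte order mark and picks the byte order by itself.

```python
def decode_id3_text(body: bytes) -> str:
    """Decode a text frame body: 1 encoding byte followed by the string (hypothetical helper)."""
    encoding, raw = body[0], body[1:]
    if encoding == 0x00:
        # ISO-8859-1: one byte per character.
        text = raw.decode("latin-1")
    elif encoding == 0x01:
        # Unicode with BOM: the codec consumes FF FE / FE FF and sets the byte order.
        text = raw.decode("utf-16")
    else:
        raise ValueError(f"unexpected encoding byte: {encoding:#04x}")
    # Drop an optional trailing terminator.
    return text.rstrip("\x00")

print(decode_id3_text(b"\x00Hi"))                     # ISO-8859-1
print(decode_id3_text(b"\x01\xff\xfeH\x00i\x00"))     # little-endian BOM
print(decode_id3_text(b"\x01\xfe\xff\x00H\x00i"))     # big-endian BOM
```

All three calls print `Hi`; only the bytes on disk differ.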

I'd advise you to open some mp3 files in a hex editor and dissect them.

Licensed under: CC-BY-SA with attribution