Question

I am trying to load an ID3 image tag that appears to have been saved in UTF-16 JFIF format. The library I am using (JUCE) fails to parse the image, as it assumes that the data is in a raw binary format.

The majority of image tags I've parsed successfully report the encoding as ISO-8859-1 (Latin-1), but because Latin-1 is a subset of UTF-16 a conversion wouldn't work.

How can I get this UTF-16 encoded binary block into the raw format that I want? And could anybody enlighten me as to the benefits of storing an image in UTF-16 format?!

Solution

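Latin-1 is not a subset of UTF-16! The first 256 Unicode code points match Latin-1, but UTF-16 stores each of them in two bytes, so the byte sequences differ.

I think you are confusing text encoding with binary data. UTF-16 is a character encoding whose code unit is a 16-bit integer (UTF-8 uses 8-bit code units).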
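To make the difference concrete, here is a small Python sketch (Python is chosen only for illustration; it is not part of the original question) that encodes the same character three ways:

```python
text = "é"  # U+00E9, a character present in Latin-1

latin1 = text.encode("latin-1")     # one byte per character
utf16le = text.encode("utf-16-le")  # one 16-bit code unit, little-endian, no BOM
utf8 = text.encode("utf-8")         # two 8-bit code units for this character

print(latin1.hex(), utf16le.hex(), utf8.hex())  # prints: e9 e900 c3a9
```

The code point is the same in all three cases, but the bytes on disk are not, which is why Latin-1 bytes cannot simply be read as UTF-16.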

A JPEG picture (JFIF) is binary data, and it should never be run through a character-encoding conversion.
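It may also be worth checking whether the image was converted at all. In the ID3v2.3 and ID3v2.4 APIC frame layout, the text-encoding byte applies only to the description string; the MIME type is always ISO-8859-1 and the picture data follows as raw bytes. A tag that "reports UTF-16" may therefore just carry a UTF-16 description in front of an untouched JPEG. The following is a minimal sketch of splitting such a frame body, assuming `body` is the frame content after the 10-byte frame header and that no unsynchronisation, compression, or encryption is in use. `split_apic_body` is a hypothetical helper name, not a JUCE API.

```python
# Text-encoding byte -> (Python codec, terminator length in bytes).
# Values 2 and 3 exist only in ID3v2.4.
ENCODINGS = {
    0: ("latin-1", 1),
    1: ("utf-16", 2),      # UTF-16 with BOM
    2: ("utf-16-be", 2),   # UTF-16 big-endian, no BOM
    3: ("utf-8", 1),
}


def split_apic_body(body: bytes):
    """Return (mime, picture_type, description, picture_bytes)."""
    codec, term_len = ENCODINGS[body[0]]

    # The MIME type is always Latin-1 and ends at a single zero byte.
    mime_end = body.index(b"\x00", 1)
    mime = body[1:mime_end].decode("latin-1")
    picture_type = body[mime_end + 1]

    # The description ends at a terminator as wide as one code unit,
    # so step through it one code unit at a time.
    start = mime_end + 2
    end = start
    terminator = b"\x00" * term_len
    while body[end:end + term_len] != terminator:
        if end + term_len > len(body):
            raise ValueError("unterminated description")
        end += term_len
    description = body[start:end].decode(codec)

    # Everything after the terminator is the raw image, left untouched.
    return mime, picture_type, description, body[end + term_len:]


# Hand-built example body: UTF-16 description "Cover", then JPEG bytes.
body = (b"\x01" + b"image/jpeg\x00" + b"\x03"
        + "Cover".encode("utf-16") + b"\x00\x00"
        + b"\xff\xd8\xff\xe0")
mime, picture_type, description, picture = split_apic_body(body)
```

If the returned picture bytes start with the JPEG marker `FF D8 FF`, the image was never text-encoded and can be handed to the image loader as is.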

If that actually happened, you may be out of luck: the result of running a character conversion over a binary stream depends on whichever "source" charset was assumed at the time, and some byte values may not survive it.
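A short sketch of why the assumed source charset matters, using Python's built-in codecs as stand-ins for whatever tool did the conversion (that tool is unknown here):

```python
blob = bytes(range(256))  # every possible byte value

# ISO-8859-1 maps each byte to exactly one code point, so the trip is reversible.
via_latin1 = blob.decode("latin-1").encode("utf-16-le")
restored = via_latin1.decode("utf-16-le").encode("latin-1")

# A decoder that rejects some bytes (UTF-8 here) substitutes U+FFFD for them,
# and the original byte values cannot be recovered afterwards.
via_utf8 = blob.decode("utf-8", errors="replace").encode("utf-16-le")
damaged = via_utf8.decode("utf-16-le")

print(restored == blob)     # prints: True
print("\ufffd" in damaged)  # prints: True
```

So a conversion that assumed ISO-8859-1 can be undone, while one that assumed a stricter charset has most likely destroyed part of the image.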

You can try to convert the UTF-16 data back to the original bytes with iconv by guessing the initial source charset, for example by converting from UTF-16 to ISO-8859-1.
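The same guess-and-check idea can be sketched in Python instead of iconv. The candidate list and the helper name `recover_jpeg` are assumptions made for illustration; the check relies only on the fact that JPEG data starts with the bytes `FF D8 FF`.

```python
JPEG_SOI = b"\xff\xd8\xff"  # start-of-image marker plus first marker byte
CANDIDATES = ("latin-1", "cp1252", "mac_roman")  # guessed source charsets


def recover_jpeg(utf16_blob: bytes, candidates=CANDIDATES):
    """Return (charset, raw_bytes) for the first guess that yields a JPEG, else None."""
    try:
        # "utf-16" honours a BOM if present, otherwise uses native byte order.
        text = utf16_blob.decode("utf-16")
    except UnicodeDecodeError:
        return None
    for charset in candidates:
        try:
            raw = text.encode(charset)
        except UnicodeEncodeError:
            continue  # this charset cannot represent the text; wrong guess
        if raw.startswith(JPEG_SOI):
            return charset, raw
    return None


# Simulate the damage: JPEG bytes read as Latin-1 text, then saved as UTF-16.
original = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00"
mangled = original.decode("latin-1").encode("utf-16")
result = recover_jpeg(mangled)
```

A matching header only shows that the first bytes came back correctly; the recovered file should still be opened in an image decoder to confirm the rest survived.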

Licensed under: CC-BY-SA with attribution