Question

I am reviewing legacy code that handles different types of images, including JPEG. The legacy code uses a 4-byte sequence to determine whether a byte[] is a valid JPEG. Specifically:

0xFF 0xD8 0xFF and either 0xE0 or 0xE1.

When I did my research I found that all you need is 0xFF 0xD8 0xFF, the 0xE0 or 0xE1 are part of what is known as the APP segment (http://en.wikipedia.org/wiki/JPEG#Syntax_and_structure). This segment is application specific and other research I have done indicates there are at least 4 other values possible in this segment (0xE2, 0xE3, 0xE8 and 0xED).

I thought a JPEG is a JPEG is a JPEG: that given a file with any one of the 6 known/allowed APP segments, any device that can display a JPEG would be able to display that file. Is this correct?

Is there a good reason for filtering based on particular values of the APP segment?

How exactly does the APP segment affect the JPEG image? Is it used at all, or is it application-specific data that is only used when the displaying application recognizes its APP value? For instance, I read that 0xED is used by Photoshop. So if the image is being displayed by Photoshop, the data in the APP segment is meaningful, but in any other app the APP segment is ignored and the image is displayed just fine.

NOTE: in the end a Java applet will be displaying the JPEG.


Solution

I thought a JPEG is a JPEG is a JPEG.

Actually, most files referred to as "a JPEG file" are either JFIF or Exif. :-)

Exif uses the structure of JFIF, so you can parse them just the same. But because JFIF specifies that the first APP segment must be APP0/JFIF, and Exif says that the first APP segment must be APP1/Exif, they are not really compatible. Some JFIF files contain an Exif APP segment in a later position and use it for metadata. Some "JPEG"s contain neither an Exif nor a JFIF APP segment, but still contain valid JPEG code streams. Most software glosses over this fact, though.

Is there a good reason for filtering based on particular values of the APP segment?

Depends. For example, if you want to filter out Exif only, or ISO JPEG only, then yes. If you want to read as many "JPEG"s as possible, then you obviously don't want this.

Some software (e.g. the default Java JPEGImageReaderSpi used by ImageIO, since you mention Java) uses just the SOI marker (0xFF, 0xD8) to identify JPEG. Making sure the next byte is 0xFF is of course an extra precaution, to filter out false positives.
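The check described above can be sketched in a few lines of Java. This is only an illustration, not the code ImageIO actually uses; the class name JpegSignature and the method looksLikeJpeg are made up for this example.

```java
// Hypothetical helper, not part of ImageIO or any library.
final class JpegSignature {
    private JpegSignature() {}

    /**
     * Returns true if the data starts with the SOI marker (0xFF 0xD8)
     * followed by 0xFF, the first byte of the next marker.
     * The fourth byte (0xE0, 0xE1, 0xDB, ...) is deliberately not inspected.
     */
    static boolean looksLikeJpeg(byte[] data) {
        return data != null
                && data.length >= 3
                && (data[0] & 0xFF) == 0xFF
                && (data[1] & 0xFF) == 0xD8
                && (data[2] & 0xFF) == 0xFF;
    }
}
```

A passing check only means the stream starts like a JPEG; it says nothing about whether the rest of the file decodes.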

How exactly does the APP segment affect the JPEG image?

Some APP segments affect how the compressed JPEG data is to be interpreted. Most JPEG reading software needs to be aware of at least APP0/JFIF, APP1/Exif, APP2/ICC_PROFILE and APP14/Adobe to properly interpret and convert color from the compressed data. Ignoring these will most likely produce images with strange-looking or inaccurate colors.

Other segments, like the APP0/JFXX (thumbnail extension), APP13/Photoshop 3.0 and APP1/XMP tags are used mainly for metadata, and can probably be ignored.

Also note that the APPn segments start with a null-terminated ASCII string after the APPn marker, to fully identify the APP segment type. It's not enough to just look at the marker.
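To illustrate the marker-plus-identifier idea, here is a rough Java sketch that walks the segments before the scan data and reports each APPn segment together with its identifier string. The class and method names are invented for this example, and it assumes a well-formed header in which every marker before SOS carries a 2-byte big-endian length; it is not how ImageIO or any particular library does it.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class AppSegmentLister {
    private AppSegmentLister() {}

    /** Returns entries such as "APP0/JFIF" for each APPn segment found before the scan data. */
    static List<String> listAppSegments(byte[] data) {
        List<String> found = new ArrayList<>();
        if (data == null || data.length < 4
                || (data[0] & 0xFF) != 0xFF || (data[1] & 0xFF) != 0xD8) {
            return found; // no SOI marker, so not a JPEG stream
        }
        int pos = 2;
        while (pos + 4 <= data.length) {
            if ((data[pos] & 0xFF) != 0xFF) {
                break; // not positioned at a marker; stop rather than guess
            }
            int marker = data[pos + 1] & 0xFF;
            if (marker == 0xFF) {
                pos++; // fill byte before the real marker
                continue;
            }
            if (marker == 0xDA || marker == 0xD9) {
                break; // SOS (compressed data follows) or EOI
            }
            // Segment length is big-endian and includes its own two bytes.
            int length = ((data[pos + 2] & 0xFF) << 8) | (data[pos + 3] & 0xFF);
            if (length < 2 || pos + 2 + length > data.length) {
                break; // truncated or corrupt segment
            }
            if (marker >= 0xE0 && marker <= 0xEF) {
                int start = pos + 4;
                int end = pos + 2 + length;
                int nul = start;
                while (nul < end && data[nul] != 0) {
                    nul++; // identifier runs up to the first NUL (or the segment end)
                }
                String id = new String(data, start, nul - start, StandardCharsets.US_ASCII);
                found.add("APP" + (marker - 0xE0) + "/" + id);
            }
            pos += 2 + length;
        }
        return found;
    }
}
```

With a list like this, a filter can decide on "APP1/Exif" versus "APP1/http://ns.adobe.com/xap/1.0/" instead of treating every 0xE1 marker the same way.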

PS: To read JPEGs in Java, you might want to have a look at my TwelveMonkeys ImageIO library, to expand the number of "JPEG" varieties ImageIO can read.

Licensed under: CC-BY-SA with attribution