Question

I have found a number of resources but nothing that has quite helped me with what I am looking for. I am trying to understand the .png and.jpg file formats enough to be able to modify and/or read the exif or other meta data in the files or to create my own meta data if possible.

I want to do this in the context of an android application so we can keep if there but it is really not exclusive to that. I am trying to figure out how to do this using a simple imput stream byte array and go from there.

Android itself has to at least extract the RGB pixel information at some point when it creates a bmp image from the stream, I took a look in the BitMapFactory source to try and understand it but I got lost somewhere after delving into the Native files.

I assume the bmps are losing any exif/meta data in the files based on my research. So I guess I want to break the inputstreams down by byte arrays and remove meta data. In .pngs I know there is no 'standard' but based on this page it seems there is some organization of the meta data you can store.

With all that said, I wouldn't mind just leaving exif/png standards behind and trying to store my own information in some sort of standardized way, but I need to know more about how the image readers id the files as either jpg, png, ect. then determine where the pixel information is located.

So I guess my first question is, has anyone done something similar to this before so that they can file me in? If not, does anyone know of any good libraries that might be good for educational purposes into figuring out how to locate and extract this data?

Or even more basically, what is a good way to find meta data and/or the exif standard or even the rgb data programmatically using something like a byte array?

Was it helpful?

Solution

There are a lot of things to address in your question, but first I should clarify that when you say "Android itself has to at least extract the RGB pixel information," what you're referring to is the act of decompression, which is complicated in the case of JPEG, and non trivial even for PNG. I think it would be very useful for you to read through the wikipedias for JPEG and PNG before attempting to go any further (especially sections on header, syntax, file structure, etc).

That being said, you've got the right idea. It shouldn't be too difficult to read in the header of an image as a byte array/stream, make some changes, and replace the old file. A PNG file can be identified by the first 8 bytes, and there should be a similar way to identify a JPEG - I can't remember off the top of my head.

To modify PNG meta data, you'll have to understand "chunks" - types/names, ordering, format, CRC, etc. The libpng website has some good resources for this, here's general PNG info, as well as chunk specifications. Make sure you don't forget to recalculate the CRC if you change anything.

JPEG sections off a file using "markers," which are two bytes long and always start with FF. Exif is just a regular JPEG file with a more specific structure for meta data, and this seems like a reasonable introduction: Exit/TIFF

There are probably libraries for Android/Java that conveniently take care of this for you, but I've never used any myself. A quick google search turns up this, and I'm sure there are many other options if you don't want to take the time to write a parser yourself.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top