Is there a way to infer what image format a file is, without reading the entire file?

StackOverflow https://stackoverflow.com/questions/52739

  •  09-06-2019

Question

Is there a good way to see what format an image is, without having to read the entire file into memory?

Obviously this would vary from format to format (I'm particularly interested in TIFF files) but what sort of procedure would be useful to determine what kind of image format a file is without having to read through the entire file?

BONUS: What if the image is a Base64-encoded string? Any reliable way to infer it before decoding it?


Solution

Most image file formats begin with a distinctive byte signature (a "magic number"). The Unix file command reads only the start of a file to work out what type of data it contains. See the Wikipedia article on magic numbers in files and magicdb.org.
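As a rough illustration of the idea, here is a minimal Python sketch that reads only the first 16 bytes of a file and compares them against a few well-known signatures. The names SIGNATURES, sniff_bytes and sniff_file are hypothetical helpers made up for this example, and the table is deliberately incomplete.

```python
# Well-known leading signatures; illustrative only, not an exhaustive list.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"II*\x00", "tiff"),  # little-endian ("Intel") TIFF
    (b"MM\x00*", "tiff"),  # big-endian ("Motorola") TIFF
    (b"BM", "bmp"),
]


def sniff_bytes(head: bytes):
    """Return a format name if the leading bytes match a known signature."""
    for magic, name in SIGNATURES:
        if head.startswith(magic):
            return name
    return None


def sniff_file(path):
    """Read only the first 16 bytes of the file, never the whole image."""
    with open(path, "rb") as f:
        return sniff_bytes(f.read(16))
```

Calling sniff_file on a path returns a short format name, or None when no signature in the table matches. A short signature such as BM can also match non-image data by coincidence, so treat the result as a hint rather than proof.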

OTHER TIPS

Sure there is. As others have mentioned, most image formats start with some sort of magic number, and a fixed run of leading bytes always encodes to the same leading Base64 characters. Here are a couple of examples:

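A bitmap (BMP) will start with Qk, followed by a character from 0 to 3 that depends on the byte after the BM signature (so Qk3 is only one possibility).

A JPEG will start with /9j/

A GIF will start with R0l (that's a zero as the second character).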

And so on. It's not hard to take the magic numbers of the different image types and work out what they encode to. Just be careful: some formats have more than one magic number (TIFF has two, for instance), so you need to account for each of them in your Base64 'translation code'.
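An alternative to maintaining a table of Base64 prefixes is to decode only the first few characters and compare the resulting bytes with the usual signatures. The sketch below assumes standard Base64 with no data: URI prefix in front; sniff_base64 and SIGNATURES are hypothetical names used only for illustration.

```python
import base64
import binascii

# Well-known leading signatures; illustrative only, not an exhaustive list.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"II*\x00", "tiff"),
    (b"MM\x00*", "tiff"),
    (b"BM", "bmp"),
]


def sniff_base64(text: str):
    """Guess the image format from the start of a Base64 string."""
    # Every 4 Base64 characters decode to 3 bytes, so the first 16
    # characters yield 12 bytes, enough for the signatures above.
    chunk = text.lstrip()[:16]
    chunk = chunk[: len(chunk) // 4 * 4]  # keep whole 4-character groups
    try:
        head = base64.b64decode(chunk)
    except (binascii.Error, ValueError):
        return None
    for magic, name in SIGNATURES:
        if head.startswith(magic):
            return name
    return None
```

Only the first 16 characters are ever decoded, so the cost does not depend on the size of the image. Comparing decoded bytes also sidesteps cases where a signature does not fill a whole 3-byte group, which is why the third Base64 character of a BMP can vary.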

Either use file on the *nix command line or read the initial bytes of the file yourself. Most formats have a unique header in the first few bytes. For example, a TIFF header looks something like this:

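0x00000000: 4949 2a00 0800 0000

In that dump, 4949 is the byte-order mark II (little-endian), 2a00 is the TIFF magic number 42, and 0800 0000 is the offset (8) of the first image file directory. The TIFF specification explains the rest of the format in detail.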

A comprehensive site of file formats is available at:

http://www.wotsit.org

TIFFs will begin with either II or MM (Intel or Motorola byte ordering, i.e. little-endian or big-endian).
The TIFF 6.0 specification is freely available to download and isn't too hard to follow.
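To make the eight-byte TIFF header concrete, here is a small Python sketch that unpacks it with the struct module. It covers classic TIFF only (magic number 42); parse_tiff_header is a hypothetical helper name, and the returned dictionary layout is an arbitrary choice for this example.

```python
import struct


def parse_tiff_header(head: bytes):
    """Parse the first 8 bytes of a classic TIFF file.

    Returns a dict with the byte order and the offset of the first
    image file directory (IFD), or None if this is not a TIFF header.
    """
    if len(head) < 8:
        return None
    order = head[:2]
    if order == b"II":      # Intel: little-endian
        endian = "<"
    elif order == b"MM":    # Motorola: big-endian
        endian = ">"
    else:
        return None
    # 2-byte magic number followed by a 4-byte offset to the first IFD.
    magic, ifd_offset = struct.unpack(endian + "HI", head[2:8])
    if magic != 42:
        return None
    return {
        "byte_order": "little" if endian == "<" else "big",
        "ifd_offset": ifd_offset,
    }
```

Feeding it the bytes 49 49 2a 00 08 00 00 00 gives a little-endian header with the first IFD at offset 8.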

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow