Question

I'm trying to create a program to get specific EXIF information of a jpeg without using PIL and such. I'm reading the file in binary but the output is slightly confusing...

with open("/Users/Niko/Desktop/IMG.JPG", "rb") as file:
    # The output shown below is 32 bytes long, so read 32 bytes.
    print(file.read(32))

Which outputs:

b'\xff\xd8\xff\xe1/\xfeExif\x00\x00MM\x00*\x00\x00\x00\x08\x00\x0b\x01\x0f\x00\x02\x00\x00\x00\x06\x00\x00'

What I'm confused about is what the "\", "/", and "*" mean... I know that the first few bytes that signify it's a JPEG are 0xFF 0xD8, so I take it the \s are 0s? Can anyone help me understand this?

Apologies for any beginner's mistakes; I'm new to coding in general and kind of just jumped into creating this program.


Solution

Python presents you with a representation of the byte string that you can copy and paste into a Python interpreter again.

To keep the output readable, and to let it survive being pasted into something that doesn't handle raw bytes, any byte that isn't printable is shown as a Python escape sequence, \xHH, where HH is the hexadecimal value of that byte. The backslash is not part of the data and does not stand for 0; \xff is a single byte with the value 0xFF.

Any byte that is printable is shown directly as its ASCII character. The byte 0x41 is the capital letter A in ASCII, and is printed as such:

>>> b'\x41'
b'A'

Thus, * is hex 2A, / is hex 2F:

>>> hex(ord(b'*'))
'0x2a'
>>> hex(ord(b'/'))
'0x2f'
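
To see that the escaped and unescaped forms are only two spellings of the same data, here is a small sketch using the first ten bytes from the question. The variable names are mine, chosen for illustration.

```python
# First ten bytes of the output shown in the question.
data = b'\xff\xd8\xff\xe1/\xfeExif'

# Indexing or iterating a bytes object yields plain integers,
# regardless of how repr() chooses to display each byte.
values = list(data[:6])
print(values)  # [255, 216, 255, 225, 47, 254]

# A printable character and its \xHH escape are the same byte.
same_slash = (b'/' == b'\x2f')
same_star = (b'*' == b'\x2a')

# The four characters \xff in the source text describe one byte.
escape_length = len(b'\xff')
print(same_slash, same_star, escape_length)  # True True 1
```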

You could use binascii.hexlify() to generate an all-hexadecimal representation of your bytes:

>>> from binascii import hexlify
>>> hexlify(b'\xff\xd8\xff\xe1/\xfeExif\x00\x00MM\x00*\x00\x00\x00\x08\x00\x0b\x01\x0f\x00\x02\x00\x00\x00\x06\x00\x00')
b'ffd8ffe12ffe4578696600004d4d002a00000008000b010f0002000000060000'
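
Since the goal is to pull EXIF fields out by hand, the following is a rough sketch of how those same 32 bytes could be interpreted with the standard struct module. It assumes the usual JPEG/EXIF layout (SOI marker, APP1 segment, "Exif" identifier, TIFF header, first IFD), handles only this one happy path, and the variable names are my own. A real file needs far more checking than this.

```python
import struct

# The 32 bytes from the question, copied verbatim.
data = (b'\xff\xd8\xff\xe1/\xfeExif\x00\x00MM\x00*\x00\x00\x00\x08'
        b'\x00\x0b\x01\x0f\x00\x02\x00\x00\x00\x06\x00\x00')

# bytes.hex() (Python 3.5+) gives the same digits as hexlify(), as a str.
hex_text = data.hex()

# JPEG markers and segment lengths are always big-endian.
soi, app1, app1_length = struct.unpack(">HHH", data[0:6])
exif_id = data[6:12]          # expected to be b'Exif\x00\x00'

# The TIFF header starts right after the identifier, at offset 12.
tiff_start = 12
byte_order = data[12:14]      # b'MM' = big-endian, b'II' = little-endian
endian = ">" if byte_order == b"MM" else "<"
magic, ifd_offset = struct.unpack(endian + "HI", data[14:20])

# The IFD offset is relative to the start of the TIFF header.
ifd_start = tiff_start + ifd_offset
(entry_count,) = struct.unpack(endian + "H", data[ifd_start:ifd_start + 2])

# Each IFD entry is 12 bytes: tag, type, count, then 4 bytes of value/offset.
# Only the first 8 bytes of the first entry fit in this 32-byte sample.
tag, field_type, count = struct.unpack(
    endian + "HHI", data[ifd_start + 2:ifd_start + 10])

print(hex(soi), hex(app1), app1_length, exif_id, byte_order)
print(magic, ifd_offset, entry_count, hex(tag), field_type, count)
```

Here the "/" and "\xfe" that looked odd turn out to be the two bytes of the APP1 segment length (0x2FFE), and the "*" is the low byte of the TIFF magic number 42. The first entry's tag, 0x010F, is the camera Make, stored as an ASCII string of 6 bytes.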

That said, you would be better off installing Pillow (the modernized fork of the Python Imaging Library) and having it handle JPEG images, including extracting EXIF information, for you.

Licensed under: CC-BY-SA with attribution