Python: Convert a string to its binary representation [duplicate]

https://stackoverflow.com/questions/21155889

28-09-2022

Question

I use Python 2.7.X.
I have a text file with the following content:

\xe87\x00\x10LOL

Note that this is the text itself, not its binary representation (meaning the first character is '\\', not 0xe8). When I read it (in binary mode), I get:

a = "\\xe87\\x00\\x10LOL"

because it is a text file.

I want to convert it to a binary form, meaning I want to get a file which begins with the characters
0xe8, 0x37, 0x00, 0x10, 0x4c, 0x4f, 0x4c.
(Note that 0x4c == 'L', 0x4f == 'O').

How do I do that?
I tried several approaches, such as hexlify/unhexlify and int(c, 16), but I seem to be missing something.
Also note that the length of the file varies, so struct.pack is less preferred.

Solution

Decode the text with the string-escape (or unicode-escape) codec:

>>> content = r'\xe87\x00\x10LOL'
>>> print content
\xe87\x00\x10LOL
>>> content
'\\xe87\\x00\\x10LOL'
>>> content.decode('string-escape')
'\xe87\x00\x10LOL'
>>> map(hex, map(ord, content.decode('string-escape')))
['0xe8', '0x37', '0x0', '0x10', '0x4c', '0x4f', '0x4c']

>>> # In Python 2, bytes is an alias for str, so this only stringifies the list:
>>> bytes(map(ord, content.decode('string-escape')))
'[232, 55, 0, 16, 76, 79, 76]'

>>> bytearray(map(ord, content.decode('string-escape')))
bytearray(b'\xe87\x00\x10LOL')
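To get from the text file to an actual binary file, the decoding step can be wrapped in a small helper. The following is only a sketch: the function name unescape_file and both path parameters are hypothetical, and it uses the unicode_escape codec through codecs.decode so that it also runs on Python 3, where string-escape is not available. It assumes the input contains no escapes above \xff (for example \u0100), since those cannot be mapped to a single byte.

```python
import codecs


def unescape_file(src_path, dst_path):
    """Read backslash-escaped text from src_path, write raw bytes to dst_path."""
    # Read in binary mode so the backslashes arrive untouched.
    with open(src_path, "rb") as src:
        escaped = src.read()
    # unicode_escape interprets the \xNN sequences and yields code points
    # 0-255; encoding as latin-1 maps each code point back to one byte.
    raw = codecs.decode(escaped, "unicode_escape").encode("latin-1")
    # Write in binary mode so no newline translation happens.
    with open(dst_path, "wb") as dst:
        dst.write(raw)
    return raw
```

Any trailing newline in the text file is carried over to the output, so strip it first if the binary file must not contain it.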

OTHER TIPS

Here is one way to do it:

In [25]: import ast

In [26]: a = r"\xe87\x00\x10LOL"

In [27]: b = ast.literal_eval("'" + a + "'")

In [28]: open("test.dat", "wb").write(b)

In [29]: 
[1]+  Stopped                 ipython
$ xxd test.dat
0000000: e837 0010 4c4f 4c                        .7..LOL

(There are probably better tools than literal_eval, but that's the first that came to mind at this early hour in the morning.)

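Another approach, which works only when every byte in data is written as a \xNN escape (it fails on the example above, where 7 and LOL are literal characters):

"".join([chr(int(i,16)) for i in data.split("\\x") if i])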
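For input that mixes \xNN escapes with literal characters, as in the question, a regular expression can replace only the escapes and leave everything else alone. This is a minimal sketch, not part of the original answers: the name replace_hex_escapes is hypothetical, and it handles only two-digit \x escapes (not \n, \\, or octal escapes).

```python
import re

# A literal backslash, an "x", then exactly two hex digits.
_HEX_ESCAPE = re.compile(br"\\x([0-9a-fA-F]{2})")


def replace_hex_escapes(data):
    """Replace each two-digit hex escape in a byte string with its byte."""
    def to_byte(match):
        value = int(match.group(1), 16)
        # bytes(bytearray([n])) yields a single byte on Python 2.7 and 3.
        return bytes(bytearray([value]))
    return _HEX_ESCAPE.sub(to_byte, data)
```

Called on the bytes read from the question's file, replace_hex_escapes(b"\\xe87\\x00\\x10LOL") leaves the literal 7 and LOL in place and converts the three escapes.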

Licensed under: CC-BY-SA with attribution
