Python 3 - print utf-8 encoded data into console (not "\x00(\x00A\x04" )

https://stackoverflow.com/questions/20375202

console
python-3.x
utf

29-08-2022

Question

import requests
from lxml import etree

r = requests.get('...', allow_redirects=True)
pagetext = r.text
tree = etree.HTML(pagetext)
node = tree.xpath('...')[0]
out = str(etree.tostring(node, method='text', encoding='UTF8'))
print(out)  # prints something like "\x00(\x00A\x04>\x042\x04<\x045\x04A"

I've tried calling .encode('UTF-8') on different parts of the strings, but still no luck :(

Solution

That's not UTF-8.

>>> b"\x00(\x00A\x04>\x042\x04<\x045\x04A".decode('utf-16be')
'(Aовмес'
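The escapes show up in the first place because calling str() on a bytes object returns its repr instead of decoding it. A minimal sketch of the difference, using only the sample bytes from the question (the variable names are illustrative, not from the original code):

```python
# Sample bytes as they appear in the question's printed output.
raw = b"\x00(\x00A\x04>\x042\x04<\x045\x04A"

# str() on bytes does not decode anything; it returns the repr,
# so the result is the literal text b'\x00(\x00A...' with escapes.
as_repr = str(raw)

# Decoding with the encoding the bytes are actually in gives real text.
text = raw.decode("utf-16be")  # '(Aовмес'
```

Passing text to print() then shows readable characters, provided the console can display Cyrillic.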

Note that "utf-16be" was chosen because it fits your sample data as posted; the real data is more likely UTF-16LE, in which case the sample simply starts one byte into a 2-byte code unit.
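To illustrate the byte-order point, here is a small sketch that decodes the same sample both ways. The one-byte shift is an assumption about how the excerpt was cut, not something the question confirms:

```python
sample = b"\x00(\x00A\x04>\x042\x04<\x045\x04A"

# As posted, the sample lines up as big-endian UTF-16.
as_be = sample.decode("utf-16be")  # '(Aовмес'

# Assumption: the excerpt was cut one byte into a little-endian stream.
# Dropping the first and last byte realigns it to whole 2-byte units.
realigned = sample[1:-1]
as_le = realigned.decode("utf-16le")  # '(совме'
```

The little-endian reading produces all-Cyrillic text after the parenthesis rather than a stray Latin "A", which is one reason to suspect UTF-16LE for the full data.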

Licensed under: CC-BY-SA with attribution
