Python hashlib & decode() on a Bytes Object

https://stackoverflow.com/questions/11991510

26-06-2021

Question

I'm not understanding something about hashlib. I'm not sure why I can decode a regular byte object, but can't decode a hash that's returned as a byte object. I keep getting this error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xad in position 1: invalid start byte

Here's my test code that's producing this error. The error is on line 8 (h2 = h.decode('utf-8')):

import hashlib

pw = 'wh@teV)r'
salt = 'b7u2qw^T&^#U@Lta)hvx7ivRoxr^tDyua'
pwd = pw + salt
h = hashlib.sha512(pwd.encode('utf-8')).digest()
print(h)
h2 = h.decode('utf-8')
print(h2)

If I don't hash it, it works perfectly fine...

>>> pw = 'wh@teV)r'
>>> salt = 'b7u2qw^T&^#U@Lta)hvx7ivRoxr^tDyua'
>>> pwd = pw + salt
>>> h = pwd.encode('utf-8')
>>> print(h)
b'wh@teV)rb7u2qw^T&^#U@Lta)hvx7ivRoxr^tDyua'
>>> h2 = h.decode('utf-8')
>>> print(h2)
wh@teV)rb7u2qw^T&^#U@Lta)hvx7ivRoxr^tDyua

So I'm guessing I'm not understanding something about the hash, but I have no clue what I'm missing.

Solution

In the second example you're just encoding to UTF-8 and then decoding the result straight back.

In the first example, on the other hand, you're encoding to UTF-8, hashing the result, and then trying to decode the digest as though it were still UTF-8 text. For SHA-512, digest() returns 64 essentially arbitrary bytes, so whether they happen to form valid UTF-8 is purely down to chance (and even if they did, the Unicode string they represent would bear no relation to the original string).
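
If the goal is just a printable form of the hash (an assumption, since the question doesn't say what h2 is for), one option is to skip decode() and ask for a text encoding of the raw bytes instead. The sketch below reuses the question's pw and salt; hex_text and b64_text are names made up for illustration. hexdigest() and base64.b64encode() are standard-library calls, and both produce pure ASCII that converts back to the original digest bytes.

```python
import base64
import hashlib

pw = 'wh@teV)r'
salt = 'b7u2qw^T&^#U@Lta)hvx7ivRoxr^tDyua'
pwd = pw + salt

hasher = hashlib.sha512(pwd.encode('utf-8'))

# Raw digest: 64 arbitrary bytes, not encoded text, so .decode('utf-8') is not meaningful.
digest = hasher.digest()

# Option 1: hexadecimal text (2 characters per byte, 128 characters for SHA-512).
hex_text = hasher.hexdigest()

# Option 2: Base64 text (shorter than hex); the Base64 output is ASCII, so decoding it is safe.
b64_text = base64.b64encode(digest).decode('ascii')

print(hex_text)
print(b64_text)

# Both forms convert back to the original digest bytes.
assert bytes.fromhex(hex_text) == digest
assert base64.b64decode(b64_text) == digest
```

Either string can be stored or printed safely; which one to use depends on whatever will read it back later.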

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow