Your only concern with the contents of passwords should be about Unicode normalization. There are subtle differences in the Unicode symbols generated by different operating systems — for example, some might encode letters like "à" as a single character (U+00E0), while others might produce "à" (two characters: the plain Latin letter a, followed by the combining grave accent character U+0300). You should normalize Unicode passwords before hashing them in order to make sure that when your users type their passwords on different operating systems, such differences would not prevent them from gaining access to their accounts.
>>> a1 = u'à'
>>> a2 = u'à'
>>> a1
u'\xe0'
>>> a2
u'a\u0300'
>>> a1 == a2
False
>>> from unicodedata import normalize
>>> normalize('NFC', a1) == normalize('NFC', a2)
True