Not sure what you mean, here are some:
UTF-16
- Lead surrogate with regular unit or another lead surrogate following
\xD8\x00\x00\x00
or\xD8\x00\xDB\xFF
- Trail surrogate without lead surrogate before it
\x00\x61\xDC\00
- Trail surrogate in lead position
\xDF\xFF\xDB\xFF
- Lead surrogate as last unit
\xD8\x01<EOF>
- Lead surrogate as last unit, followed by a half trail surrogate. This bug exists in python 2.7.3:
'\xD8\x00\xDC'.decode('utf-16be')
UTF-32
- Unit value returns true for
value < 0
,value > 0x10FFFF
or0xD800 <= value && value <= 0xDFFF