Looks like a bug in PyPDF2. In this section:
    if string.startswith(codecs.BOM_UTF16_BE):
        retval = TextStringObject(string.decode("utf-16"))
        retval.autodetect_utf16 = True
it assumes that any string starting with (0xFE, 0xFF) can be decoded as UTF-16. Your file contains a bytestring that begins that way but then contains invalid UTF-16.
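To illustrate the failure mode outside of PyPDF2, here is a minimal sketch. The sample bytes and the helper name try_decode_utf16 are made up for the demonstration and are not taken from your file or from the library; the point is only that a BOM prefix does not guarantee the rest is valid UTF-16.

```python
import codecs

# Hypothetical sample: a big-endian BOM followed by a lone high surrogate
# (0xD800) that is not followed by a low surrogate, which is invalid UTF-16.
data = codecs.BOM_UTF16_BE + b"\xd8\x00\x00\x41"


def try_decode_utf16(raw):
    """Return the decoded text, or None if the bytes are not valid UTF-16."""
    try:
        return raw.decode("utf-16")
    except UnicodeDecodeError:
        return None


# The prefix check passes even though decoding fails.
starts_with_bom = data.startswith(codecs.BOM_UTF16_BE)
decoded = try_decode_utf16(data)
```

Here starts_with_bom is true while decoded is None, which is the combination the BOM-only check does not account for.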
The simplest fix is to comment out that if check and unconditionally use the branch marked "# This is probably a big performance hit here".
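If you would rather keep the UTF-16 path for strings that really are UTF-16, a gentler variant is to attempt the decode and fall back when it fails. The sketch below is standalone and is not PyPDF2's code: decode_pdf_string is a hypothetical helper, and latin-1 is only a stand-in assumption for the library's own PDFDocEncoding decoder.

```python
import codecs


def decode_pdf_string(raw):
    """Hypothetical helper (not part of PyPDF2): decode with a fallback."""
    if raw.startswith(codecs.BOM_UTF16_BE):
        try:
            return raw.decode("utf-16")
        except UnicodeDecodeError:
            # BOM-like prefix, but the rest is not valid UTF-16.
            pass
    # Assumption: latin-1 stands in for PyPDF2's PDFDocEncoding decoding.
    return raw.decode("latin-1")
```

Valid UTF-16 strings still decode as before, and strings that merely start with the two BOM bytes no longer raise.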