Question

I have a lengthy JSON file that contains UTF-8 characters (and is encoded in UTF-8). I want to read it in Python using the built-in json module.

My code looks like this:

dat = json.load(open("data.json"), "utf-8")

I understand the "utf-8" argument should be unnecessary, since UTF-8 is supposedly the default. However, I get this error:

Traceback (most recent call last):
  File "winratio.py", line 9, in <module>
    dat = json.load(open("data.json"), "utf-8")
  File "C:\Python33\lib\json\__init__.py", line 271, in load
    return loads(fp.read(),
  File "C:\Python33\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 28519: character maps to <undefined>

My question is: Why does python seem to ignore my encoding specification and try to load the file in cp1252?


OTHER TIPS

Try this:

import codecs

dat = json.load(codecs.open("data.json", "r", "utf-8"))

This related question also covers writing in UTF-8 with the codecs library: Write to UTF-8 file in Python
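On Python 3 the codecs wrapper isn't strictly needed: the built-in open() accepts an encoding keyword, which overrides the platform default (locale.getpreferredencoding(), typically cp1252 on Windows, which is why the traceback shows cp1252.py). A minimal sketch, which writes a small sample data.json first so it is self-contained:

```python
import json

# Create a small UTF-8 JSON file with non-ASCII content for demonstration.
with open("data.json", "w", encoding="utf-8") as f:
    json.dump({"name": "café"}, f, ensure_ascii=False)

# Pass encoding="utf-8" to open() itself; json.load() then receives
# already-decoded text, so no encoding argument is needed there.
with open("data.json", encoding="utf-8") as f:
    dat = json.load(f)

print(dat["name"])  # café
```

Note that in Python 3, json.load() does not take an encoding as its second positional argument, which is why passing "utf-8" there had no effect on how the file was decoded.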

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow