Writing and then reading a string in a file encoded in latin1

Question

Your data was written out as UTF-8:

>>> 'On écrit ça dans un fichier.'.encode('utf8').decode('latin1')
'On Ã©crit Ã§a dans un fichier.'

This means either that you did not write out Latin-1 data, or that your source code was saved as UTF-8 while you declared your script (using a PEP 263-compliant header) to be Latin-1 instead.

If you saved your Python script with a header like:

# -*- coding: latin-1 -*-

but your text editor saved the file with UTF-8 encoding instead, then the string literal:

s='On écrit ça dans un fichier.'

will be misinterpreted by Python in the same manner: each UTF-8 byte is decoded as a separate Latin-1 character. Saving the resulting Unicode value to disk as Latin-1, then reading it again as Latin-1, will preserve the error.
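To see why the round trip preserves the error, here is a minimal sketch that simulates the mismatch rather than depending on how your editor saved the file. The temporary file and the variable names are illustrative assumptions, not taken from your script.

```python
import os
import tempfile

# The intended text, spelled with escapes so this sketch does not
# depend on the encoding of its own source file.
original = 'On \u00e9crit \u00e7a dans un fichier.'

# Simulate the mismatch: UTF-8 bytes on disk, decoded as Latin-1.
misread = original.encode('utf-8').decode('latin-1')

# Write the misread string as Latin-1, then read it back as Latin-1.
fd, path = tempfile.mkstemp(suffix='.txt')
os.close(fd)
try:
    with open(path, 'w', encoding='latin-1') as f:
        f.write(misread)
    with open(path, 'r', encoding='latin-1') as f:
        round_tripped = f.read()
finally:
    os.remove(path)

print(round_tripped == misread)   # True: the file I/O itself is lossless
print(round_tripped == original)  # False: the damage happened before writing
```

The file handling is working as intended here; the string was already wrong before it was written.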

To debug, please take a close look at print(s.encode('unicode_escape')) in the first script. If it looks like:

b'On \\xc3\\xa9crit \\xc3\\xa7a dans un fichier.'

then your actual source file encoding and the PEP 263 header disagree on how the source code should be interpreted. If your source code is decoded correctly, the output is:

b'On \\xe9crit \\xe7a dans un fichier.'
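If the check shows the doubled `\xc3` bytes, a string that is already mangled can usually be recovered by reversing the two steps. The helper below is a sketch under that assumption; `repair_mojibake` is a hypothetical name, and the real fix is still to make the header and the file encoding agree.

```python
def repair_mojibake(text):
    """Undo a 'UTF-8 read as Latin-1' mix-up.

    Returns the text unchanged when the reversal does not apply,
    for example when the text was decoded correctly in the first place.
    """
    try:
        return text.encode('latin-1').decode('utf-8')
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# The mangled value, written with escapes to stay source-encoding neutral.
mangled = 'On \u00c3\u00a9crit \u00c3\u00a7a dans un fichier.'
repaired = repair_mojibake(mangled)
print(repaired.encode('unicode_escape'))
```

A correctly decoded string passes through untouched, because its Latin-1 bytes are not valid UTF-8 and the helper falls back to returning the input.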

If Spyder is stubbornly ignoring the PEP 263 header and reading your source as Latin-1 regardless, avoid non-ASCII characters in the source and use escape codes instead, either \uxxxx Unicode code point escapes:

s = 'On \u00e9crit \u00e7a dans un fichier.'

or \xhh one-byte escape codes for code points below 256:

s = 'On \xe9crit \xe7a dans un fichier.'
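As a quick sanity check, the two escape styles should produce the same string, and that string should encode to single bytes in Latin-1. This is only an illustrative sketch; the variable names are assumptions.

```python
s_u = 'On \u00e9crit \u00e7a dans un fichier.'  # \uxxxx escapes
s_x = 'On \xe9crit \xe7a dans un fichier.'      # \xhh escapes

print(s_u == s_x)  # True: both spell the same code points

# Each accented character becomes exactly one byte in Latin-1.
latin1_bytes = s_u.encode('latin-1')
print(len(latin1_bytes) == len(s_u))  # True
```

Because the literals contain only ASCII characters, they are read the same way whichever encoding the editor or interpreter assumes for the source file.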