Question

This is more of an educational question about character encoding (hobbyist coder here), but I have encountered this specific problem:

1) I wrote a silly program in Python 3. The editor I am using ALWAYS saves as UTF-8 (German keyboard here).

2) To share my "creation" with interested family members, I pasted the code into a private paste on Pastebin.com (with instructions: copy/paste the (raw) file into a text file and change the file extension to .py).

Here the trouble starts:

3) Following these instructions does not produce a program that runs.

4) I am not sure why it doesn't work, but since the character encoding is now ANSI, I assume this is the problem. Changing the encoding back to UTF-8 in a code editor solves the problem.

The questions are:

a) Why does it change to ANSI?

b) Why doesn't it work in ANSI anyway (since, just by eye, the whole code looks the same)?

c) How can I preserve the UTF-8 encoding? I mean: my family doesn't know how to change the encoding... (I know... just send them the executable file. But as I said... educational)

edit: clarified python-3.x version


Solution

The trouble starts when they paste the text into an editor and press save. A text file cannot be saved without some kind of encoding, so unless one is specified explicitly, you are at the mercy of the editor's default, such as "ANSI" (on Windows, the system's legacy code page, e.g. Windows-1252 on Western European systems).
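As a rough illustration of why the encoding chosen at save time matters, the sketch below encodes the same line of source text twice. It assumes "ANSI" means Windows-1252 (cp1252), which is typical for a German Windows system but is an assumption here; the sample line is made up.

```python
# The same text, turned into bytes with two different encodings.
# Assumption: "ANSI" on the family's machines is Windows-1252 (cp1252).
text = 'print("Grüße")'

utf8_bytes = text.encode("utf-8")
ansi_bytes = text.encode("cp1252")

# The characters look identical in an editor, but the stored bytes differ:
# in UTF-8, ü and ß take two bytes each; in cp1252 they take one byte each.
print(len(utf8_bytes), utf8_bytes)
print(len(ansi_bytes), ansi_bytes)
```

Only the non-ASCII characters differ between the two byte strings, which is why the files look the same by eye.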

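It probably fails because Python reads the file as UTF-8: either you have declared # -*- coding: utf-8 -*- in your file, or, with no declaration at all, Python 3 assumes UTF-8 by default. The text editor saving the file as "ANSI" knows nothing about this and of course leaves the declaration as is. The parser then tries to decode the file as UTF-8 and fails, because characters such as ä, ö, ü and ß are stored as different bytes in ANSI, even though they look identical on screen.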
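The failure can be reproduced without any editor. This is a minimal sketch, again assuming "ANSI" is Windows-1252; the source line is a made-up example, and the decode step stands in for what the Python parser does when it reads the file.

```python
# A line of source as an "ANSI" (assumed: cp1252) editor would store it.
ansi_bytes = 'name = "Müller"'.encode("cp1252")

# Reading those bytes back as UTF-8 fails, because the single byte 0xFC
# (ü in cp1252) is not a valid UTF-8 sequence.
try:
    ansi_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError as err:
    decoded_ok = False
    print("not valid UTF-8:", err)
```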

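To make the source immune to this, you could use \uXXXX escapes for non-ASCII characters in string literals. The file then contains only ASCII characters, which are stored as the same bytes in ANSI and in UTF-8.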

So instead of:

text = "€"

write:

text = "\u20AC"
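A quick way to convince yourself that the escaped form survives any of these encodings is sketched below. The variable names are illustrative only, and cp1252 is again an assumed stand-in for "ANSI".

```python
# The escape and the literal character produce the same string at runtime.
literal = "€"
escaped = "\u20AC"
print(literal == escaped)

# The escaped source line itself is pure ASCII (a raw string is used here
# so the backslash sequence is kept as written, like text in a file).
source_line = r'text = "\u20AC"'
as_ascii = source_line.encode("ascii")
as_ansi = source_line.encode("cp1252")
as_utf8 = source_line.encode("utf-8")

# Identical bytes in every encoding, so a wrong default cannot corrupt it.
print(as_ascii == as_ansi == as_utf8)
```

The trade-off is readability: escapes are robust against re-saving, but the source is harder to read than the literal characters.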
Licensed under: CC-BY-SA with attribution