I suspect that print("öäü߀".encode('L9')) will solve your problems.
Python 2.7 Unicode Error within a function (using __future__ print_function and unicode_literals)
30-06-2022
Question
I've read some threads about unicode now.
I am using Python 2.7.2, but with print_function from __future__ (because the raw print statement is quite confusing for me).
So here is some code:
# -*- coding: L9 -*-
from __future__ import print_function, unicode_literals
Now, if I print things like
print("öäüߧ€")
it works perfectly. However (and yes, I am totally new to Python), if I declare a function that prints unicode strings, it blows up my script:
def foo():
    print("öäü߀")
foo()
Traceback (most recent call last):
  File "C:\Python27\test1.py", line 7, in <module>
    foo()
  File "C:\Python27\test1.py", line 5, in foo
    print("÷õ³▀Ç")
  File "C:\Python27\lib\encodings\cp850.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\x80' in position 4: character maps to <undefined>
What's the best way to handle this error and unicode in general? And should I stick with the 2.7 print statement instead?
Solution
OTHER TIPS
This may help:
print(type(s1))
s1.encode('ascii', errors='ignore')  # this works
s1.decode('ascii', errors='ignore')  # this does not work
The reason is that s1.decode can't decode a unicode string directly, so Python 2 first makes an implicit call to encode, but without the errors='ignore' flag, and that implicit call raises the error.
Whether you issued your commands from a file or from a Python prompt with unicode support may explain why you get an error in the latter case but not the former.
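The encode/decode asymmetry described above can be reproduced by making the hidden step explicit. This is only a sketch: s1 here is a hypothetical unicode string (the answer never shows how s1 was defined), and the u"" prefix is used so the snippet does not depend on unicode_literals.

```python
# s1 is a hypothetical unicode string: o-umlaut, a-umlaut, u-umlaut,
# sharp s, euro sign, then plain "abc". Escapes keep the source pure ASCII.
s1 = u"\xf6\xe4\xfc\xdf\u20acabc"

# Explicit encode with errors='ignore' silently drops everything
# that ASCII cannot represent.
ignored = s1.encode("ascii", "ignore")

# A strict ASCII encode is what Python 2 performs implicitly before
# s1.decode(...) can run, and it is the step that fails.
try:
    s1.encode("ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

print(repr(ignored))
print(strict_failed)
```

Only the ASCII part of the string survives the 'ignore' encode, while the strict encode raises UnicodeEncodeError.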
Console code pages use legacy "OEM" code pages for compatibility with old DOS console programs, while the rest of Windows uses updated code pages that support modern characters but still differ by region. In your case the console uses cp850 and GUI programs use cp1252. cp850 doesn't support the Euro character, so Python raises an exception when trying to print the character on the console. You can run chcp 1252 before running your script if you need the Euro to work. Make sure the console font supports the character, though.
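To check the code page claim without touching the console, one can ask Python's codecs directly. A minimal sketch; can_encode is a hypothetical helper name, not something from the question:

```python
def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

EURO = u"\u20ac"  # the Euro sign

# Try the console code page, the GUI code page and Latin-9.
results = {enc: can_encode(EURO, enc)
           for enc in ("cp850", "cp1252", "iso8859_15")}

for enc in sorted(results):
    print(enc, results[enc])
```

Of the three, only cp850 reports False for the Euro sign, which matches the exception raised on the console.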
BTW, L9 != cp1252 either.
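The difference between the two encodings shows up in the Euro sign itself. The sketch below compares the bytes each codec produces and then misreads the cp1252 byte as L9. The last step is an assumption about what may have happened: it is consistent with the u'\x80' in the traceback, but the question does not say how the file was actually saved.

```python
EURO = u"\u20ac"  # the Euro sign

# The same character maps to different bytes in the two encodings.
latin9_bytes = EURO.encode("L9")      # L9 is an alias for iso8859_15
cp1252_bytes = EURO.encode("cp1252")

# Assumption: a file saved as cp1252 but declared as L9 would have its
# Euro byte decoded to a control character instead of the Euro sign.
misread = cp1252_bytes.decode("L9")

print(repr(latin9_bytes))
print(repr(cp1252_bytes))
print(repr(misread))
```

L9 puts the Euro at byte 0xA4 and cp1252 puts it at 0x80; read back as L9, byte 0x80 becomes the control character U+0080, which cp850 cannot encode.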
Are you sure printing from the console worked with a Euro? When I cut-and-paste your print, I get the following if the code page is 850, but it works after chcp 1252.
>>> print("öäüߧ€")
öäüߧ? # Note the ?
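A similar ? can be produced on purpose by encoding with errors='replace', which is one way to keep a script from crashing on a console that lacks a character. This is a sketch of that fallback, not necessarily what the console did in the session above:

```python
# The string from the question, written with escapes:
# o-umlaut, a-umlaut, u-umlaut, sharp s, section sign, euro sign.
text = u"\xf6\xe4\xfc\xdf\xa7\u20ac"

# Encode for cp850, replacing anything it cannot represent with '?',
# then decode again to see what a cp850 console would display.
shown = text.encode("cp850", "replace").decode("cp850")

print(repr(shown))
```

Every character except the Euro survives the round trip; the Euro comes back as a literal question mark.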
Encoding charts: