Question

I am trying to use the Tamil language in Python but ran into difficulties. Here is my code:

U=u'\u0B83'
print U

This raises the following error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\u0b83' in position 0: ordinal not in range(128)

My default encoding is ASCII. Since u'\u0b83' is already a Unicode string, I expected it to print the Tamil character.

I also tried adding # -*- coding: utf-8 -*- at the top of the file, but the result is the same.

How do I get Python to print this as Unicode?

Solution 3

What I needed was the raw-unicode-escape codec.

If I use encode('raw-unicode-escape').decode('utf-8'), everything works perfectly. I found the answer in the question "Python Convert Unicode-Hex utf-8 strings to Unicode strings".
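
For anyone trying this, here is a rough sketch of what that round trip does. It assumes the real data was a unicode string whose code points are actually UTF-8 bytes (the u'\xe0\xae\x83' value below is my own illustration, not taken from the question). For a string that is already correct, such as u'\u0b83', the same call only produces the escape text, so it may not help with the exact snippet in the question.

```python
# A unicode string whose three code points are really the UTF-8 bytes
# of U+0B83 (hypothetical input, chosen for illustration).
mis_decoded = u'\xe0\xae\x83'

# raw-unicode-escape maps code points below 256 straight to bytes...
raw = mis_decoded.encode('raw-unicode-escape')

# ...so decoding those bytes as UTF-8 recovers the intended character.
fixed = raw.decode('utf-8')          # u'\u0b83'

# A string that is already correct is turned into escape text instead.
already_ok = u'\u0b83'
escaped = already_ok.encode('raw-unicode-escape')   # b'\\u0b83'
```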

OTHER TIPS

On Linux at least, you can set your locale to use UTF-8 before starting Python:

$ export LC_ALL=en_GB.utf8
$ python

You can of course use any locale with a compatible encoding (but I recommend UTF-8).
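
To check whether the locale change took effect, something like the following might help. It is only a sketch: describe_output_encoding is a name I made up, and the values it reports depend on your terminal and environment.

```python
import locale
import sys


def describe_output_encoding():
    """Report the encodings Python will use for text output."""
    return {
        # Encoding attached to standard output; may be None when the
        # output is piped or redirected.
        'stdout': getattr(sys.stdout, 'encoding', None),
        # Encoding derived from the current locale settings.
        'preferred': locale.getpreferredencoding(),
    }


info = describe_output_encoding()
print(info)
```

If 'stdout' shows an ASCII encoding (or None), printing the Tamil character directly is likely to fail as in the question.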

Alternatively, encode the string when outputting it:

>>> print U.encode('utf-8')
ஃ
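
If you don't want to call encode() at every print, one option is to wrap the output stream once so it encodes for you. The sketch below uses an in-memory io.BytesIO buffer as a stand-in for the real output stream (that substitution is my assumption, made so the result can be inspected).

```python
import io

U = u'\u0B83'  # TAMIL SIGN VISARGA

# Encoding by hand gives the three UTF-8 bytes for this character.
encoded = U.encode('utf-8')          # b'\xe0\xae\x83'

# Wrap a byte stream so that every write is encoded as UTF-8.
# buf stands in for a real byte-oriented output stream here.
buf = io.BytesIO()
writer = io.TextIOWrapper(buf, encoding='utf-8')
writer.write(U)
writer.flush()

# The wrapper produced the same bytes as the manual encode.
same = buf.getvalue() == encoded
```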

Take a look at these earlier questions and their answers:

Python, Unicode, and the Windows console

Changing default encoding of Python?

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow