Question

I'm new to Python and I need some help getting this code to work.

This code works correctly; it converts strings the way I need:

# -*- coding: utf-8 -*-
import sys
import arabic_reshaper
from bidi.algorithm import get_display

reshaped_text = arabic_reshaper.reshape(u' الحركات')
bidi_text = get_display(reshaped_text)
print >>open('out', 'w'), reshaped_text.encode('utf-8') # This is ok

I get the following error when I try to read the string from a file:

# -*- coding: utf-8 -*-
import sys
import arabic_reshaper
from bidi.algorithm import get_display

with open("/home/nemo/Downloads/mpcabd-python-arabic-reshaper-552f3f4/data.txt", "r") as myfile:
    data = myfile.read().replace('\n', '')
reshaped_text = arabic_reshaper.reshape(data)
bidi_text = get_display(reshaped_text)
print >>open('out', 'w'), reshaped_text.encode('utf-8')

UnicodeDecodeError: 'ascii' codec can't decode byte 0xd8 in position 0: ordinal not in range(128).

Any help?

Thanks


Solution

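In Python 2, the decode() method of a byte string decodes it using the codec registered for the given encoding. If you pass no encoding, it falls back to the default string encoding, which is ASCII. The same ASCII default is used whenever Python 2 has to convert a byte string to unicode implicitly, and that is why the Arabic bytes read from your file raise UnicodeDecodeError.

When you read a UTF-8 encoded file, you need to decode the result explicitly with string.decode('utf8').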
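
To see where the error message comes from, here is a small sketch that reproduces it without any file. The sample word is taken from the question; the variable names are only for illustration. It encodes the text to UTF-8 bytes, shows that decoding those bytes as ASCII fails on the first byte, and then decodes them correctly as UTF-8.

```python
# -*- coding: utf-8 -*-

# UTF-8 bytes, as they would come back from reading the file in byte mode.
raw = u'الحركات'.encode('utf-8')

# Decoding with the ASCII codec fails on the very first byte (0xd8).
try:
    raw.decode('ascii')
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

# Decoding with the codec the file was actually written in works.
text = raw.decode('utf-8')

print(message)
```

The printed message should match the one in the question, which suggests the bytes from data.txt were being decoded as ASCII somewhere along the way.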

Write:

data = u'my data'  # a unicode string; encode() turns it into UTF-8 bytes
with open("file.txt", "w") as f:
    f.write(data.encode('utf-8'))

Read:

with open("file.txt" , "r") as f:
    data = f.read().decode('utf-8')

Other suggestions

In Python 3 you can also use the optional encoding parameter of the built-in open function (in Python 2, io.open accepts the same parameter):

with open("/home/nemo/Downloads/mpcabd-python-arabic-reshaper-552f3f4/data.txt",
          'rt',
          encoding='utf8') as f:
    data = f.read().replace('\n', '')  # already decoded text, no .decode() needed
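
If you need the same code to run on both Python 2 and Python 3, one option is io.open, which takes the encoding parameter on both versions. The sketch below is only an illustration: read_text and write_text are hypothetical helper names, and it uses a temporary file instead of your data.txt so it can be run as-is.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def write_text(path, text):
    # io.open encodes the unicode text to UTF-8 while writing.
    with io.open(path, 'w', encoding='utf-8') as f:
        f.write(text)


def read_text(path):
    # io.open decodes the UTF-8 bytes while reading, so the result is
    # already unicode text; newlines are dropped as in the question.
    with io.open(path, 'r', encoding='utf-8') as f:
        return f.read().replace(u'\n', u'')


# Round trip through a temporary file (a stand-in for data.txt).
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, 'data.txt')
write_text(sample_path, u'الحركات\n')
data = read_text(sample_path)
```

The value in data can then be passed to arabic_reshaper.reshape() as in the question, and the result written out with write_text instead of print >>open(...).
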
Licensed under: CC-BY-SA with attribution