As @Martijn suggests, decoding your original file correctly would be a better solution. If your file is Hebrew but displays the characters in array, it is probably being decoded as latin1 or cp1252. cp1255 looks like a close match. Perhaps your array1 isn't quite right. Also note that strings are iterable, so you can simplify your arrays:
# coding: utf8
array = u'àáâãäåæçèéêëìíîïðñóôõöøùúûüýþÿ'
array1 = u'אבגדהוזחטיךכלםמןנסעףפץצקרשת'
print(array)
print(array1)
print(array.encode('cp1252').decode('cp1255',errors='replace'))
The last line above reverses the "incorrect" decoding by re-encoding the text as cp1252, then decodes the resulting bytes with cp1255 (a Hebrew encoding) instead. Output:
àáâãäåæçèéêëìíîïðñóôõöøùúûüýþÿ
אבגדהוזחטיךכלםמןנסעףפץצקרשת
אבגדהוזחטיךכלםמןנסףפץצרשת���
It's not a perfect match, but close enough that I think your original file was encoded with cp1255.
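If the file really is cp1255, reading it with that encoding avoids the translation step entirely. Below is a minimal sketch, assuming cp1255 is the right guess; read_text and fix_mojibake are hypothetical helper names, and the temporary file only exists to make the example self-contained:

```python
# coding: utf8
import io
import os
import tempfile


def read_text(path, encoding='cp1255'):
    """Read a file as text. cp1255 is an assumption about the source file."""
    with io.open(path, encoding=encoding) as f:
        return f.read()


def fix_mojibake(text, wrong='cp1252', right='cp1255'):
    """Undo a wrong decode: re-encode with `wrong`, then decode with `right`."""
    return text.encode(wrong).decode(right, errors='replace')


if __name__ == '__main__':
    sample = u'שלום'
    # Write a small cp1255 file to stand in for the original Hebrew file.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with io.open(path, 'w', encoding='cp1255') as f:
            f.write(sample)
        # Decoding with the right encoding gives Hebrew directly.
        print(read_text(path) == sample)
        # Decoding as cp1252 gives accented Latin; fix_mojibake repairs it.
        print(fix_mojibake(read_text(path, encoding='cp1252')) == sample)
    finally:
        os.remove(path)
```

Both lines should print True. io.open is used so the same code runs on Python 2 and 3; on Python 3 the built-in open accepts the same encoding argument.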