Question
I've got a problem with the encoding of the file that I'm importing (it contains Polish special characters). How do I make it work?
The error says:
Traceback (most recent call last):
  File "D:/Users/Denis/Dysk Google/scripts/python/napisy/napisy", line 6, in <module>
    str = inputfile.read() #name for the file
  File "D:\Python33\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 2: character maps to <undefined>
The part of the code that has the problem:
inputfilename = "a.txt"
outputfilename = inputfilename[0:-4]+"_fixed"+".txt"
inputfile = open(inputfilename, 'r')
str = inputfile.read() #name for the file
newstring = str.replace("œ", "s").replace("ê","e").replace("³","l").replace("¹","a").replace("¿","z").replace("ñ","n").replace("Ÿ","z").replace("æ","c")
outputfile = open(outputfilename, "w")
outputfile.write(newstring)
outputfile.close()
Solution
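The traceback shows Python decoding the file with cp1252, in which byte 0x8f is undefined. In cp1250, the Windows code page for Central European languages, that byte is 'Ź', so you should try 'cp1250' as the encoding: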
import codecs
content = None
# Decode the file as cp1250 instead of the platform default (cp1252 here)
with codecs.open('file-name', 'r', encoding='cp1250') as f:
    content = f.read()
print(content)
If this fails, you may also try the ISO-8859-2 encoding.
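The "try cp1250, then ISO-8859-2" advice can be wrapped in a small helper. This is only a sketch: the function name read_text_with_fallback and its encodings parameter are made up for illustration, and it uses the built-in open(), which accepts an encoding argument in Python 3, rather than codecs.open.

```python
def read_text_with_fallback(path, encodings=("cp1250", "iso-8859-2")):
    """Return (text, encoding_used) for the first encoding that decodes the file.

    Hypothetical helper: the candidate encodings are tried in the order given.
    """
    if not encodings:
        raise ValueError("at least one encoding is required")
    last_error = None
    for encoding in encodings:
        try:
            # open() decodes the bytes with the given encoding while reading
            with open(path, "r", encoding=encoding) as f:
                return f.read(), encoding
        except UnicodeDecodeError as exc:
            # This candidate cannot decode the file; remember why and move on
            last_error = exc
    raise last_error
```

In the script from the question, this would replace the open(inputfilename, 'r') and read() pair. Note that a decode without errors does not prove the guess was right: ISO-8859-2 assigns a character to every byte, so it always "succeeds", and the result still needs a visual check.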