Italian detected as iso-8859-2

https://stackoverflow.com/questions/12822978

python
encoding
chardet

06-07-2021

Problem

I am using chardet to detect the encoding of text files, including Italian ones. The problem is that it consistently detects their encoding as iso-8859-2, while the correct result would be iso-8859-1. Does anybody know a fix? My local language is set to Polish. Could that influence the detection?

Solution

chardet doesn't support iso-8859-1, which is why it isn't detecting it. For the supported character encodings, see chardet's homepage: http://pypi.python.org/pypi/chardet.
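To see why a detector can land on iso-8859-2 for Italian text, it may help to decode the same bytes under both encodings. The sketch below uses only the standard library, not chardet; the sample sentence and the helper name decode_candidates are made up for illustration. Both decodings succeed without error, so byte validity alone cannot separate the two encodings, and only the resulting letters differ.

```python
# Hypothetical Italian sample, encoded the way the files are assumed
# to be stored (iso-8859-1).
SAMPLE = "La città è più bella"
raw = SAMPLE.encode("iso-8859-1")


def decode_candidates(data, encodings=("iso-8859-1", "iso-8859-2")):
    """Decode the same bytes under each candidate single-byte encoding."""
    return {enc: data.decode(enc) for enc in encodings}


results = decode_candidates(raw)
for enc, text in results.items():
    # Neither decode raises; the accented letters come out differently.
    print(f"{enc}: {text}")
```

Under iso-8859-2 the Italian accented vowels turn into Central European letters, which a reader of the language spots at once but which is still a perfectly valid byte sequence.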

I use the Linux program 'file' to get the character encoding of different content. I'm not sure how reliable it is (see my question "Encoding detection in Python, use the chardet library or not?"), but it has given me good results so far.
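A minimal sketch of calling 'file' from Python, assuming the program is installed and on the PATH. The function name detect_with_file is hypothetical, and the function returns None when 'file' is missing or fails, so the caller can fall back to something else.

```python
import shutil
import subprocess


def detect_with_file(path):
    """Return the encoding name reported by `file`, or None if unavailable."""
    if shutil.which("file") is None:
        return None  # `file` is not installed on this system
    proc = subprocess.run(
        # --brief omits the file name; --mime-encoding prints only the charset.
        ["file", "--brief", "--mime-encoding", "--", path],
        capture_output=True,
        text=True,
        check=False,
    )
    if proc.returncode != 0:
        return None
    return proc.stdout.strip() or None
```

Calling detect_with_file("some_file.txt") yields a string with the reported charset name. As with any guess-based detection, the result is a heuristic and worth spot-checking against a few known files.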

By the way, your local language setting should not influence the detection.

License: CC-BY-SA with attribution

Not affiliated with StackOverflow