Is there a way to check the charset encoding of a .txt file with Java?
14-06-2021
Question
Is there a way to check in Java whether a text file (.txt) is encoded in Unicode or UTF-8?
Solution
In the general case, you cannot know with absolute certainty which charset a file uses. I found this to be a good read:
http://illegalargumentexception.blogspot.co.uk/2009/05/java-rough-guide-to-character-encoding.html
See especially the section "Automatic detection of encoding".
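One check that needs only the JDK is to ask whether the bytes are at least well-formed UTF-8. Below is a minimal sketch, assuming the file is small enough to read into memory; the class name Utf8Check and the method isValidUtf8 are made up for illustration. A "true" result does not prove the file is UTF-8 (pure ASCII passes too, and so can some text in other encodings), but a "false" result rules UTF-8 out.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of any library.
final class Utf8Check {

    /** Returns true if the bytes decode as UTF-8 with no malformed sequence. */
    static boolean isValidUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of silently substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java Utf8Check <file>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(isValidUtf8(bytes)
                ? "well-formed UTF-8 (possibly plain ASCII)"
                : "not UTF-8");
    }
}
```

The decoder is configured with CodingErrorAction.REPORT because the default String constructor replaces bad bytes silently, which would hide exactly the errors this check is looking for.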
OTHER TIPS
Uhm, theoretically, how would you know if it is Unicode?
This is the real question. Truthfully, you cannot know, but you can make a decent guess.
See "Java: How to determine the correct charset encoding of a stream" for more details. :)
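One cheap way to make such a guess is to look for a byte order mark (BOM) at the start of the file. The sketch below is only a heuristic under stated assumptions: the class BomSniffer and its method charsetFromBom are hypothetical names, only the UTF-8 and UTF-16 marks are handled (UTF-32 is ignored), and many UTF-8 files carry no BOM at all, so an empty result means "unknown", not "not Unicode".

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical helper, not part of any library.
final class BomSniffer {

    /** Guesses the charset from a leading byte order mark, if one is present. */
    static Optional<Charset> charsetFromBom(byte[] data) {
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) {
            return Optional.of(StandardCharsets.UTF_8);
        }
        if (startsWith(data, 0xFE, 0xFF)) {
            return Optional.of(StandardCharsets.UTF_16BE);
        }
        // Caveat: FF FE 00 00 is the UTF-32LE mark; this sketch does not distinguish it.
        if (startsWith(data, 0xFF, 0xFE)) {
            return Optional.of(StandardCharsets.UTF_16LE);
        }
        return Optional.empty(); // no BOM: the encoding is still unknown
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            // Mask to compare as unsigned values, since Java bytes are signed.
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

To use it on a file, pass in the first few bytes, for example from Files.readAllBytes(path). If no BOM is found, fall back to other heuristics or to an encoding agreed on out of band.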
Licensed under: CC-BY-SA with attribution