Why does the string “¿” get translated to “Â¿” when calling .getBytes()?
05-07-2019
Question
When writing the string "¿" out using
System.out.println(new String("¿".getBytes("UTF-8")));
Â¿ is written instead of just ¿.
WHY? And how do we fix it?
Solution
You don't have to use UTF-16 to solve this:
new String("¿".getBytes("UTF-8"), "UTF-8");
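works just fine. The original call goes wrong because new String(byte[]) with no charset argument decodes the bytes using the platform's default charset, which apparently is not UTF-8 on your machine. A single-byte charset such as ISO-8859-1 turns the two UTF-8 bytes of "¿" (0xC2 0xBF) into the two characters "Â¿". As long as the encoding given to the getBytes()
method is the same as the encoding you pass to the String constructor, you should be fine!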
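To make the mismatch visible, here is a minimal sketch that encodes "¿" as UTF-8 and then decodes the same bytes with two different charsets. The class name CharsetRoundTrip and the helper roundTrip are made up for illustration, and ISO-8859-1 stands in for whatever non-UTF-8 default the asker's platform had, which the question does not state.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetRoundTrip {
    // Hypothetical helper: encode text as UTF-8, then decode the bytes with the given charset.
    static String roundTrip(String text, Charset decodeWith) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, decodeWith);
    }

    public static void main(String[] args) {
        // "\u00BF" is "¿"; the escape keeps the example independent of the source file's encoding.
        String s = "\u00BF";

        // Matching charsets on both sides: the original string comes back.
        String same = roundTrip(s, StandardCharsets.UTF_8);
        System.out.println(same.equals(s));

        // Mismatched charsets (assumed ISO-8859-1 default): each UTF-8 byte becomes its own character.
        String mangled = roundTrip(s, StandardCharsets.ISO_8859_1);
        System.out.println(mangled.equals("\u00C2\u00BF")); // "\u00C2\u00BF" is "Â¿"
        System.out.println(mangled.length());
    }
}
```

Passing a Charset constant from StandardCharsets instead of the name "UTF-8" also avoids the checked UnsupportedEncodingException that the String-based overloads declare.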
OTHER TIPS
You need to specify the Charset in the String constructor (see the API docs).
Try:
System.out.println(new String("¿".getBytes("UTF-8"), "UTF-8"));
You need to specify the encoding both when converting the string to bytes and when converting the bytes back to a string.
Sounds like the system console isn't in UTF-8.
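If the console encoding is the suspect, one thing to try is wrapping System.out in a PrintStream that always encodes as UTF-8. This is only a sketch: the class name Utf8Console and the helper utf8Stream are invented here, and it helps only if the terminal itself interprets the bytes it receives as UTF-8.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

class Utf8Console {
    // Hypothetical helper: wrap any stream in an auto-flushing PrintStream that writes UTF-8.
    static PrintStream utf8Stream(OutputStream target) {
        try {
            return new PrintStream(target, true, StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java platform must support, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PrintStream out = utf8Stream(System.out);
        // Writes the bytes 0xC2 0xBF followed by a line separator, whatever the default charset is.
        out.println("\u00BF");
    }
}
```

Whether "¿" then shows up correctly still depends on the terminal's own settings, which Java does not control.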
Licensed under: CC-BY-SA with attribution