Question

The code below prints the Unicode string கா:

PrintStream sysout = new PrintStream(System.out, true, "UTF-8");
sysout.println("\u0B95\u0bbe");

Given கா as input, how can I get back the escapes \u0B95 and \u0bbe?

PS: The text is in Tamil.

Solution 2

For a single character, you can build the escape like this:

System.out.println( "\\u" + Integer.toHexString('க' | 0x10000).substring(1) );

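The | 0x10000 and substring(1) pair pads the hex value to four digits. Because a Java char is a 16-bit UTF-16 code unit, this only covers characters in the Basic Multilingual Plane (U+0000 to U+FFFF), which is all of Unicode up to version 3.0. To convert a whole string, loop over its chars, e.g.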

String foo = "கா";
for (int i = 0; i < foo.length(); i++)
    System.out.println( "\\u" + Integer.toHexString(foo.charAt(i) | 0x10000).substring(1));

which produces

\u0b95
\u0bbe

If you want them on one line, change System.out.println() to System.out.print() inside the loop and call System.out.println() once after it.
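
If you would rather get the escapes back as a single String than print them, the same loop can be wrapped in a small helper. This is only a sketch: the class name UnicodeEscapes and the method name toEscapes are made up for illustration, and it emits one escape per UTF-16 code unit, so a character outside the Basic Multilingual Plane comes out as two surrogate escapes.

```java
final class UnicodeEscapes {

    // Hypothetical helper (not from the original answer): returns one
    // backslash-u escape with four hex digits for every UTF-16 code unit.
    static String toEscapes(String text) {
        StringBuilder out = new StringBuilder(text.length() * 6);
        for (int i = 0; i < text.length(); i++) {
            // %04x zero-pads the hex value to four digits
            out.append(String.format("\\u%04x", (int) text.charAt(i)));
        }
        return out.toString();
    }
}
```

Calling UnicodeEscapes.toEscapes("கா") should return the text \u0b95\u0bbe on one line.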

OTHER TIPS

You can use the format functionality to print the Java UTF-16 string escapes.

For example, this code writes the escapes to STDOUT:

String str = "கா";
for(char ch : str.toCharArray())
   System.out.format("\\u%04x", (int) ch);
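
The char-based loops above print UTF-16 code units. If you want the actual code point values instead (so that a character outside the Basic Multilingual Plane is reported once rather than as a surrogate pair), a sketch along the following lines may help. It assumes Java 8 or later for String.codePoints(); the class name CodePointDump and the method name toCodePoints are invented for this example, and the U+XXXX notation is chosen here because Java string literals have no escape for code points above U+FFFF.

```java
import java.util.stream.Collectors;

final class CodePointDump {

    // Hypothetical helper: lists each Unicode code point of the input
    // in U+XXXX notation, separated by single spaces.
    static String toCodePoints(String text) {
        return text.codePoints()
                   .mapToObj(cp -> String.format("U+%04X", cp))
                   .collect(Collectors.joining(" "));
    }
}
```

For the string கா this should give U+0B95 U+0BBE.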
Licensed under: CC-BY-SA with attribution