For JDK 9 and later, the static method Character.codePointOf(String name) is the simplest approach:
public static int codePointOf(String name)
Returns the code point value of the Unicode character specified by the given Unicode character name.
This works for all Unicode characters, not just those in the Basic Multilingual Plane. For example, running this code on Java 12 ...
String s1 = "LATIN SMALL LETTER A WITH DIAERESIS";
int cp1 = Character.codePointOf(s1);
System.out.println("Unicode name \"" + Character.getName(cp1) + "\" => code point " + cp1 + " => character " + Character.toString(cp1));
String s2 = "EYES";
int cp2 = Character.codePointOf(s2);
System.out.println("Unicode name \"" + Character.getName(cp2) + "\" => code point " + cp2 + " => character " + Character.toString(cp2));
String s3 = "DNA Double Helix"; // Requires JDK 12 or later; earlier releases throw java.lang.IllegalArgumentException.
int cp3 = Character.codePointOf(s3);
System.out.println("Unicode name \"" + Character.getName(cp3) + "\" => code point " + cp3 + " => character " + Character.toString(cp3));
...produces this output...
Unicode name "LATIN SMALL LETTER A WITH DIAERESIS" => code point 228 => character ä
Unicode name "EYES" => code point 128064 => character 👀
Unicode name "DNA DOUBLE HELIX" => code point 129516 => character 🧬
To summarize the conversions:
- For code point => Unicode name, use Character.getName(codepoint)
- For code point => character representation, use Character.toString(codepoint) (this int overload requires JDK 11 or later)
- For Unicode name => code point, use Character.codePointOf(name)
- For Unicode name => character representation, no JDK method currently exists. Instead, do it indirectly via the code point of the Unicode name, as shown above. For example: Character.toString(Character.codePointOf("LATIN SMALL LETTER A WITH DIAERESIS"));
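The indirect name => character conversion can be wrapped in a small helper. This is only a minimal sketch: the class name UnicodeNames and the method charForName are hypothetical, not part of the JDK. It uses new String(Character.toChars(cp)) rather than Character.toString(int) so that it should also compile on JDK 9 and 10, where the int overload of Character.toString is not available.

```java
final class UnicodeNames {
    private UnicodeNames() {}

    /**
     * Hypothetical helper (not a JDK method): Unicode name => character as a String.
     * Throws IllegalArgumentException if the running JDK does not know the name.
     */
    static String charForName(String unicodeName) {
        int codePoint = Character.codePointOf(unicodeName); // JDK 9+
        // Character.toChars yields one char for BMP code points and a
        // surrogate pair for supplementary code points such as emoji.
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String eyes = charForName("EYES");
        // A supplementary character is one code point but two Java chars.
        System.out.println(eyes + " => length " + eyes.length()
                + ", code points " + eyes.codePointCount(0, eyes.length()));
    }
}
```

Returning a String rather than a char is deliberate here: a single char cannot hold characters outside the Basic Multilingual Plane.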
Notes:
- Be sure that the JDK release being used supports the specified Unicode names. For example, the character with the Unicode name "DNA Double Helix" was added in Unicode 11, which is only supported by JDK 12 and later. On an earlier JDK release, calling Character.codePointOf("DNA Double Helix") throws an IllegalArgumentException.
- If a white square is shown in place of the Unicode character, try changing the font (e.g. Segoe UI Emoji for rendering emoji characters).
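If the code may run on JDK releases with differing Unicode support, one option is to catch the IllegalArgumentException and treat an unknown name as "not found". The following is a sketch under that assumption; SafeUnicodeLookup and findCodePoint are hypothetical names, not JDK APIs.

```java
import java.util.OptionalInt;

final class SafeUnicodeLookup {
    private SafeUnicodeLookup() {}

    /**
     * Hypothetical helper: returns the code point for a Unicode name,
     * or an empty OptionalInt if the running JDK does not recognize it.
     */
    static OptionalInt findCodePoint(String unicodeName) {
        try {
            return OptionalInt.of(Character.codePointOf(unicodeName)); // JDK 9+
        } catch (IllegalArgumentException e) {
            // Thrown when the name is not a valid character name for the
            // Unicode version supported by this JDK release.
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        String name = "DNA Double Helix";
        OptionalInt cp = findCodePoint(name);
        if (cp.isPresent()) {
            System.out.println(name + " => code point " + cp.getAsInt());
        } else {
            System.out.println(name + " is not supported by Java "
                    + System.getProperty("java.version"));
        }
    }
}
```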