Question

I have a UTF-8 byte sequence, written as a literal, like this: "\xE2\x80\x93".

I'm trying to convert this into Unicode using Java.

But I have not been able to find a way to do this.

Can anyone help me with this?

Regards, Sat


Solution

System.out.println(new String(new byte[] {
    (byte)0xE2, (byte)0x80, (byte)0x93 }, java.nio.charset.StandardCharsets.UTF_8));

prints an en dash (U+2013), which is what those three bytes encode in UTF-8. It is not clear from your question whether you have those three bytes or, literally, the escaped string you posted. If you have the string, parse it into bytes first, for example with the following:

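// Split on the literal "\x" prefix; bstrs[0] is the empty string before the first escape.
final String[] bstrs = "\\xE2\\x80\\x93".split("\\\\x");
final byte[] bytes = new byte[bstrs.length - 1];
for (int i = 1; i < bstrs.length; i++)
  bytes[i - 1] = (byte) Integer.parseInt(bstrs[i], 16);  // the cast keeps the low 8 bits
System.out.println(new String(bytes, java.nio.charset.StandardCharsets.UTF_8));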
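The parsing step can also be wrapped in a small method, which makes it easy to check which code point the bytes decode to. This is only a sketch: the class name EscapedUtf8Demo and the method decode are made up for illustration, and it assumes the input consists solely of \xHH escapes with valid hex digits.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are illustrative only.
class EscapedUtf8Demo {

    // Decodes a string made up only of \xHH escapes into text,
    // treating the escaped values as UTF-8 bytes.
    static String decode(String escaped) {
        // The regex \\x matches a literal backslash followed by 'x'.
        String[] parts = escaped.split("\\\\x");
        // parts[0] is the empty string before the first escape, so skip it.
        byte[] bytes = new byte[parts.length - 1];
        for (int i = 1; i < parts.length; i++) {
            // parseInt yields 0..255; the cast to byte keeps the low 8 bits.
            bytes[i - 1] = (byte) Integer.parseInt(parts[i], 16);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = decode("\\xE2\\x80\\x93");
        // Print the decoded code point in U+XXXX notation.
        System.out.printf("U+%04X%n", text.codePointAt(0)); // prints U+2013
    }
}
```

Using StandardCharsets.UTF_8 instead of the charset name "UTF-8" avoids having to handle the checked UnsupportedEncodingException.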

Other tips

You can use the Apache Commons Lang StringEscapeUtils class.

Or, if you know that the string will always have the form \xHH\xHH..., you can strip the \x prefixes and decode the remaining hex digits:

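// "\\x" is the Java literal for a backslash followed by 'x'; "\x" alone does not compile.
String hex = input.replace("\\x", "");
byte[] bytes = hexStringToByteArray(hex);
String result = new String(bytes, java.nio.charset.StandardCharsets.UTF_8);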

hexStringToByteArray is here.
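In case the linked helper is not at hand, here is one possible way to write it. This is an assumed implementation, not necessarily the one the link pointed to; the enclosing class name HexDecodeSketch is hypothetical, and the method expects an even number of valid hex digits.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical container class for the sketch.
class HexDecodeSketch {

    // Assumed implementation of hexStringToByteArray: turns "E28093"
    // into the bytes 0xE2, 0x80, 0x93.
    static byte[] hexStringToByteArray(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("Hex string must have even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            // Character.digit returns -1 for characters that are not hex digits.
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Invalid hex digit near index " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "\\xE2\\x80\\x93";       // the literal text \xE2\x80\x93
        String hex = input.replace("\\x", "");  // "E28093"
        String result = new String(hexStringToByteArray(hex), StandardCharsets.UTF_8);
        System.out.println(result.equals("\u2013")); // prints true
    }
}
```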

Also see this other SO answer.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow