Question

I have the string \u0130smail and I want to convert it to İsmail. More generally, I want to convert escapes like these:

  \u0130 --> İ   
  \u00E7 --> ç

I tried

String str = "\u0130smail";
System.out.println(str);

and it worked, but whenever I get the string "\u0130smail" from the DB or the internet it doesn't give the correct result.

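static String deneme(String string){
    String string2 = null;

    try {
        // Encode to UTF-8 and decode straight back (a round trip)
        byte[] utf8 = string.getBytes("UTF-8");
        string2 = new String(utf8, "UTF-8");
    } catch (UnsupportedEncodingException e) {
        // Not expected: every Java platform must support UTF-8
    }
    return string2;
}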

didn't work either.


Solution

In Java source code, the literals "\u0130smail" and "İsmail" are exactly the same string, because the compiler translates the escape before the program ever runs. If you mean that you get a string "\\u0130smail" (note that I've escaped the backslash, so the string really contains a backslash followed by u0130), then you will have to find the escape sequences yourself and convert each one to the Unicode character it denotes, or just print the number, whichever you need. Regular expressions could help you in this case.
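As a minimal sketch of the regular-expression approach, assuming the incoming text uses the four-hex-digit form shown in the question: the class name UnicodeUnescaper and the method unescape are made up for illustration, and only java.util.regex from the standard library is used.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class UnicodeUnescaper {
    // Matches a literal backslash, the letter u, and exactly four hex digits.
    private static final Pattern ESCAPE =
            Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    static String unescape(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Parse the four hex digits into one UTF-16 code unit.
            char c = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement keeps '$' and '\' from being treated specially.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Twelve characters: a real backslash, then u0130smail.
        String raw = "\\u0130smail";
        System.out.println(unescape(raw));
    }
}
```

On a console that can display the character, the main method should print İsmail. Text without any escapes passes through unchanged, and characters outside the Basic Multilingual Plane work as long as they arrive as two consecutive escapes (a surrogate pair).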

OTHER TIPS

Converting the existing string to bytes and back again isn't going to help you. You need to look at the exact characters in the string you've got - and work out how you got them.

I suggest you print out the value of each character in the string as an integer (ideally in hex) to find out exactly what you've got, then trace it back as far as you can to work out what's going wrong.
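A small sketch of that diagnostic, with the class name CharDump and the method dump invented for illustration. It prints every UTF-16 code unit in hex, which makes it obvious whether the string holds the single character İ or the six characters of a literal escape.

```java
// Hypothetical diagnostic helper, not part of any library.
final class CharDump {
    // Returns each UTF-16 code unit of s as U+XXXX, separated by spaces.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Escape resolved by the compiler: one character, then 's'.
        System.out.println(dump("\u0130s"));
        // prints: U+0130 U+0073

        // Literal backslash followed by u0130, as it might arrive from a DB.
        System.out.println(dump("\\u0130s"));
        // prints: U+005C U+0075 U+0030 U+0031 U+0033 U+0030 U+0073
    }
}
```

If the dump of your real data starts with U+005C U+0075, the escape was stored as plain text somewhere upstream, and that is the place to look.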

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow