Question

I am receiving text over USB on Android as a string of extended ASCII characters, like this:

String receivedText = "5286T11ɬ ªË ¦¿¯¾ ¯¾ ɬ ¨¬°:A011605286 ª¿ª ¾®:12:45 ¸Í®°:(9619441121)ª¿ª:-, ®¹¿¦Í°¾ ¡ ®¹¿¦Í°¾ ª¨À, ¾¦¿µ²À ¸Í, ¾¦¿µ²À ªÂ°Íµ °¿®¾°Í͸:- ¡Í°Éª:-, ¬¾¹°, ¸¾¤¾Í°Â¼ ªÂ°Íµ~";

These characters represent a string in Hindi.

I can't work out how to convert the received string into the equivalent Hindi text. Does anyone know how to do this in Java?

Below is the code I use to convert a byte array to a hex string:

public String byteArrayToByteString(byte[] arayValue, int size) {
    if (arayValue == null || arayValue.length == 0)
        return null;

    final char[] hexDigits = "0123456789ABCDEF".toCharArray();
    // Never read past the end of the array, even if size is too large.
    int count = Math.max(0, Math.min(size, arayValue.length));
    StringBuilder out = new StringBuilder(count * 2);

    for (int i = 0; i < count; i++) {
        int value = arayValue[i] & 0xFF;       // treat the byte as unsigned
        out.append(hexDigits[value >>> 4]);    // high nibble
        out.append(hexDigits[value & 0x0F]);   // low nibble
    }
    return out.toString();
}

Let me know if this helps in finding a solution.

EDIT:

It is UTF-16 encoding, and the characters in the receivedText string are the extended ASCII form of the Hindi characters.

New Edit

I have new characters

String value = "?®Á?Ƕ ¡??°¿¯¾";

This should read मुकेश followed by "dangaria" in Hindi. Google Translate does not render "dangaria" in Hindi, so I cannot provide the Hindi version of it.

I talked to the person doing the encoding. He said he removes the leading two hex digits (the high byte) of each code point before encoding: if \u0905 represents अ in Hindi, he removes the \u09 and converts the remaining 05 to extended ASCII form.

So the new input string above was produced this way: the \u09 is removed, the rest is converted to extended ASCII, and the result is sent to the device over USB.

Let me know if this explanation helps in finding a solution.


Solution 2

Generally, for a byte array that you know to be a string value, you can use the following.

Assuming byte[] someBytes:

String stringFromBytes = new String(someBytes, "UTF-16");

You may replace "UTF-16" with the appropriate charset, which you can find with some experimentation. Java's documentation on supported character encodings may be of help.
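The experimentation mentioned above could look roughly like the following sketch, which decodes the same bytes with several candidate charsets so the results can be compared by eye. The class name CharsetProbe, the method name, and the list of candidates are made up for illustration; which charsets are actually installed depends on the runtime.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of any library.
class CharsetProbe {

    // Decodes the same bytes with every supported candidate charset.
    static Map<String, String> decodeWithCandidates(byte[] data, String... charsetNames) {
        Map<String, String> results = new LinkedHashMap<>();
        for (String name : charsetNames) {
            if (Charset.isSupported(name)) {
                results.put(name, new String(data, Charset.forName(name)));
            } // unsupported charsets are skipped silently
        }
        return results;
    }

    public static void main(String[] args) {
        byte[] someBytes = { 0x09, 0x05 }; // U+0905 as big-endian UTF-16
        Map<String, String> results =
                decodeWithCandidates(someBytes, "UTF-16BE", "UTF-16LE", "ISO-8859-1");
        for (Map.Entry<String, String> entry : results.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

Only one of the candidates should produce readable text; if none does, the data is probably not plain text in any standard encoding.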

From the details you have provided I would suggest considering the following:

  • If you're reading a file from a USB drive, Android might have existing frameworks that will help you do this in a more standard way.
  • If you most certainly need to read in and manipulate the bytes from the USB port directly, make sure that you are familiar with the API/protocol of the data you are reading. It may be that some of the bytes are control messages or something similar that cannot be converted to strings, and you will need to identify exactly where in the byte stream the string begins (and ends).
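To see where the text begins and ends in the byte stream, it can help to print the raw bytes in hex before attempting any conversion. This is a minimal sketch; the class and method names are hypothetical, and it assumes the received data is already in a byte array.

```java
// Hypothetical helper for inspecting raw bytes; not part of any USB API.
class HexDump {

    // Formats the first `length` bytes as space-separated two-digit hex values.
    static String toHex(byte[] data, int length) {
        int count = Math.max(0, Math.min(length, data.length));
        StringBuilder out = new StringBuilder(count * 3);
        for (int i = 0; i < count; i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append(String.format("%02X", data[i] & 0xFF)); // unsigned value
        }
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] received = { 0x35, 0x32, (byte) 0xAE, (byte) 0xC1 };
        System.out.println(toHex(received, received.length));
    }
}
```

Bytes below 0x80 are ordinary ASCII, so control bytes and the start of the non-ASCII text are usually easy to spot in such a dump.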

Other answers

I've been playing around with this a bit and have an idea of what you might need to do. The value of receivedText in your post appears to be encoded in windows-1252, probably as a side effect of pasting it into the post. Providing the raw byte values would be better, as it avoids any encoding errors. Regardless, I was able to convert that String into the following Unicode Devanagari characters:

5286T11फए ऋभ इडऒठ ऒठ फए उएओ:A011605286 ऋडऋ ठऍ:12:45 चयऍओ:(9619441121)ऋडऋ:-, ऍछडइयओठ ँ ऍछडइयओठ ऋउढ, ठइडगऑढ चय, ठइडगऑढ ऋतओयग ओडऍठओययच:- ँयओफऋ:-, एठछओ, चठअठयओतञ ऋतओयग~

With the following code:

final String receivedText = "5286T11ɬ ªË ¦¿¯¾ ¯¾ ɬ ¨¬°:A011605286 ª¿ª ¾®:12:45 ¸Í®°:(9619441121)ª¿ª:-, ®¹¿¦Í°¾ ¡ ®¹¿¦Í°¾ ª¨À, ¾¦¿µ²À ¸Í, ¾¦¿µ²À ªÂ°Íµ °¿®¾°Í͸:- ¡Í°Éª:-, ¬¾¹°, ¸¾¤¾Í°Â¼ ªÂ°Íµ~";

// Requires java.nio.ByteBuffer, java.nio.CharBuffer and java.nio.charset.Charset.
// Reinterpret the pasted characters as windows-1252 bytes, then decode those bytes as ISCII-91.
final Charset fromCharset = Charset.forName("x-ISCII91");
final CharBuffer decoded = fromCharset.decode(ByteBuffer.wrap(receivedText.getBytes("windows-1252")));

// A Java String is already UTF-16 internally, so no extra re-encoding step is needed.
System.out.println(decoded.toString());

Whether or not those are the expected characters is something you would need to tell me :)

Also, I'm not sure if the x-ISCII91 character encoding is available in Android.
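If x-ISCII91 is unavailable, or the output above is not the expected text, the asker's later edit suggests a simpler mapping that needs no charset at all: each byte is the low byte of a code point in the Devanagari block (U+0900 to U+097F). The sketch below assumes, in addition, that the sender sets the top bit of each such byte. That is an inference from the sample rather than something the post states: in the second sample, ® (0xAE) lines up with म (U+092E). The class and method names are hypothetical.

```java
// Hypothetical decoder for the scheme described in the question's edit.
class DevanagariLowByteDecoder {

    // Start of the Devanagari block in Unicode.
    private static final int DEVANAGARI_BASE = 0x0900;

    /**
     * Assumption: bytes below 0x80 are plain ASCII, and every byte from
     * 0x80 to 0xFF is the low byte of a Devanagari code point with the
     * top bit set (for example, 0xAE stands for U+092E).
     */
    static String decode(byte[] raw) {
        StringBuilder out = new StringBuilder(raw.length);
        for (byte b : raw) {
            int value = b & 0xFF; // treat the byte as unsigned
            if (value < 0x80) {
                out.append((char) value); // ASCII passes through unchanged
            } else {
                out.append((char) (DEVANAGARI_BASE + (value - 0x80)));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Byte values corresponding to "®Á?Ƕ" in the question, where the
        // unprintable "?" is taken to be 0x95 (an assumption).
        byte[] sample = { (byte) 0xAE, (byte) 0xC1, (byte) 0x95, (byte) 0xC7, (byte) 0xB6 };
        System.out.println(decode(sample));
    }
}
```

Under these assumptions the five sample bytes map to U+092E, U+0941, U+0915, U+0947 and U+0936, which are the code points of मुकेश. The decoder should be fed the raw bytes read from USB rather than a pasted String, because bytes in the 0x80 to 0x9F range do not survive a trip through windows-1252 or a text field intact.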

hindi = new String(receivedText.getBytes(), "UTF-16");

But this does not really look like Hindi. Are you sure it is encoded as UTF-16?

Edit:

String charset = "UTF-8";
hindi = new String(hindi.getBytes(Charset.forName(charset)), "UTF-16");

Replace UTF-8 with the actual charset that produced your long String.

Licensed under: CC-BY-SA with attribution