Question

I have a file containing UTF-32 text that was read from a database. I would expect "Hi" to be stored as H\0\0\0i\0\0\0, but it is actually \0\0\0H\0\0\0i, with the null bytes in front.

Does anyone know how this could happen, and how I can decode this while leaving all the data intact?


Solution

You appear to be getting UTF-32 in network byte order (big-endian) rather than the little-endian order you are expecting. Either order is valid for UTF-32.

Which byte order the database uses when you ask for UTF-32 will probably be controlled by that database's configuration.

You can use IPAddress.NetworkToHostOrder to convert a single code point, or UTF32Encoding with appropriate byte order to convert strings:

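        // "Hi" as big-endian UTF-32: the significant byte of each code point comes last.
        var bytes = new byte[] {0,0,0,(byte)'H',0,0,0,(byte)'i'};
        // Big-endian decoder that does not expect or emit a byte order mark.
        var encoding = new UTF32Encoding(bigEndian: true, byteOrderMark: false);
        var text = encoding.GetString(bytes);

        Console.WriteLine(text); // prints "Hi"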
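
For the single-code-point case, a minimal sketch using IPAddress.NetworkToHostOrder might look like the following. The class and method names (CodePointDemo, DecodeBigEndianCodePoint) are made up for illustration, and the sketch assumes the four bytes at the given offset hold one valid big-endian UTF-32 code point:

```csharp
using System;
using System.Net;

public static class CodePointDemo
{
    // Hypothetical helper: decodes one big-endian UTF-32 code point
    // starting at 'offset' and returns it as a .NET string.
    public static string DecodeBigEndianCodePoint(byte[] bytes, int offset)
    {
        // BitConverter reads the four bytes in the host's byte order.
        int raw = BitConverter.ToInt32(bytes, offset);

        // Reinterpret the value as network (big-endian) order and convert it
        // to host order. This is a no-op on a big-endian host.
        int codePoint = IPAddress.NetworkToHostOrder(raw);

        // Throws ArgumentOutOfRangeException if the value is not a valid
        // Unicode scalar value (a surrogate, or above 0x10FFFF).
        return char.ConvertFromUtf32(codePoint);
    }
}
```

Calling CodePointDemo.DecodeBigEndianCodePoint(new byte[] {0,0,0,(byte)'H'}, 0) returns "H". For whole strings the UTF32Encoding approach above is simpler, since it handles the loop and the validation for you.
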
Licensed under: CC-BY-SA with attribution