Question

In C#, I can encode binary data with Encoding.UTF8.GetString() and later convert it back with binary = Encoding.UTF8.GetBytes().

I expect that the result should be my original binary data in any case - no exception.

But is it true in any case?

Or does it depend on the specific behaviour of the UTF8 character set?

Or would it be better to use Encoding.ASCII.GetString() and Encoding.ASCII.GetBytes()?

If anybody knows exactly what Encoding does (how it treats special characters or special bytes), please advise.


Solution

"In C# I can encode binary data by Encoding.UTF8.GetString() and later convert it back by binary = Encoding.UTF8.GetBytes()."

No, because that isn't what a text encoding does.

A text encoding transforms arbitrary text to/from structured bytes (meaning: structured in the way defined by that encoding).

You have arbitrary bytes, not structured bytes. With the default Encoding.UTF8, any byte sequence that is not valid UTF-8 is replaced by U+FFFD when decoded, so the original bytes cannot be recovered. You should use base-64 (Convert.ToBase64String / Convert.FromBase64String), which converts arbitrary bytes to/from a structured string - in this case, structured according to the rules of base-64.

byte[] orig = ... // arbitrary binary data
string storeThis = Convert.ToBase64String(orig); // plain text, safe to store
// ...
byte[] backAgain = Convert.FromBase64String(storeThis); // same bytes as orig
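
To see the difference concretely, here is a minimal sketch that pushes the same bytes through both round trips. The class name RoundTripDemo, the helpers ViaUtf8 and ViaBase64, and the sample bytes are made up for illustration; only the Encoding and Convert calls come from the framework.

```csharp
using System;
using System.Linq;
using System.Text;

public static class RoundTripDemo
{
    // Hypothetical helper: decode the bytes as UTF-8 text, then re-encode the text.
    // Lossy whenever the input is not well-formed UTF-8.
    public static byte[] ViaUtf8(byte[] data) =>
        Encoding.UTF8.GetBytes(Encoding.UTF8.GetString(data));

    // Hypothetical helper: encode the bytes as base-64 text, then decode it again.
    // Lossless for any input.
    public static byte[] ViaBase64(byte[] data) =>
        Convert.FromBase64String(Convert.ToBase64String(data));

    public static void Main()
    {
        // 0xFF and 0xFE can never appear in well-formed UTF-8.
        byte[] orig = { 0x48, 0x69, 0xFF, 0xFE, 0x00 };

        Console.WriteLine(orig.SequenceEqual(ViaUtf8(orig)));   // False
        Console.WriteLine(orig.SequenceEqual(ViaBase64(orig))); // True
    }
}
```

The UTF-8 path does not throw with the default Encoding.UTF8 instance; it silently substitutes replacement characters, which is why the corruption is easy to miss.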

OTHER TIPS

You should only use Encoding.UTF8 when you know the bytes are UTF-8 encoded text. Calling GetString() on unknown bytes can lead to unexpected results.

So if you create the bytes with Encoding.UTF8.GetBytes("Hello world!"), you can turn them back into the same string with Encoding.UTF8.GetString(byteArray).
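
As a rough illustration of the direction that does work (text to bytes and back), and of why Encoding.ASCII is not a safer choice, consider the sketch below. The class and method names are invented for this example, and it assumes the default encoder behaviour, where Encoding.ASCII substitutes '?' for characters it cannot represent.

```csharp
using System;
using System.Text;

public static class TextRoundTripDemo
{
    // Hypothetical helper: text -> UTF-8 bytes -> text.
    public static string Utf8RoundTrip(string text) =>
        Encoding.UTF8.GetString(Encoding.UTF8.GetBytes(text));

    // Hypothetical helper: text -> ASCII bytes -> text.
    // Characters outside the 7-bit ASCII range are replaced with '?'.
    public static string AsciiRoundTrip(string text) =>
        Encoding.ASCII.GetString(Encoding.ASCII.GetBytes(text));

    public static void Main()
    {
        Console.WriteLine(Utf8RoundTrip("Hello world!") == "Hello world!"); // True
        Console.WriteLine(Utf8RoundTrip("h\u00E9llo") == "h\u00E9llo");     // True
        Console.WriteLine(AsciiRoundTrip("h\u00E9llo"));                    // h?llo
    }
}
```

Starting from well-formed text, UTF-8 gives the text back unchanged; ASCII only does so for 7-bit characters, and neither is suitable for arbitrary binary data.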

Licensed under: CC-BY-SA with attribution