Is it safe to encode and decode binary data with Encoding in C#?

https://stackoverflow.com/questions/19025452

29-06-2022

Question

In C# I can turn binary data into a string with Encoding.UTF8.GetString() and later convert it back with binary = Encoding.UTF8.GetBytes().

I expect the result to be my original binary data in every case, with no exceptions.

But is that true in every case?

Or does it depend on the specific behaviour of the UTF-8 character set?

Or would it be better to use Encoding.ASCII.GetString() and Encoding.ASCII.GetBytes()?

If anybody knows exactly what Encoding does (how it treats special characters or special bytes), please give me some advice.

Solution

"In c# I can encode binary data by Encoding.UTF8.GetString() and later convert it back by binary = Encoding.UTF8.GetBytes()."

No, because that isn't what a text encoding does.

A text encoding transforms arbitrary text to/from structured bytes (meaning: structured in the way defined by that encoding).

You have arbitrary bytes, not structured bytes. You should use base-64 (Convert.ToBase64String / Convert.FromBase64String), which converts arbitrary bytes to/from a structured string - in this case, structured according to the rules of base-64.

byte[] orig = ...
string storeThis = Convert.ToBase64String(orig);
// ...
byte[] backAgain = Convert.FromBase64String(storeThis);
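
To make the difference concrete, here is a minimal sketch that checks whether a byte array survives a trip through a string and back. The class and method names (RoundTripDemo, SurvivesTextRoundTrip, SurvivesBase64) are hypothetical and exist only for this illustration; the Encoding and Convert calls are the standard .NET ones.

```csharp
using System;
using System.Linq;
using System.Text;

public static class RoundTripDemo
{
    // Hypothetical helper: bytes -> string -> bytes through a text encoding.
    // Returns true only if the bytes come back unchanged.
    public static bool SurvivesTextRoundTrip(Encoding encoding, byte[] data)
    {
        string text = encoding.GetString(data);   // invalid sequences get replaced here
        byte[] back = encoding.GetBytes(text);
        return back.SequenceEqual(data);
    }

    // Hypothetical helper: bytes -> base-64 string -> bytes.
    public static bool SurvivesBase64(byte[] data)
    {
        string text = Convert.ToBase64String(data);
        byte[] back = Convert.FromBase64String(text);
        return back.SequenceEqual(data);
    }
}
```

With the arbitrary bytes { 0x00, 0xFF, 0xC0, 0x80 }, SurvivesTextRoundTrip returns false for both Encoding.UTF8 and Encoding.ASCII, because the bytes that are not valid in the encoding are replaced during GetString(). SurvivesBase64 returns true for the same input. Bytes that really are UTF-8 text, such as Encoding.UTF8.GetBytes("Hello world!"), survive the UTF-8 round trip.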

Other tips

You should only use Encoding.UTF8 when you know the bytes are UTF-8 encoded text. Calling GetString() on unknown bytes can lead to unexpected results, because byte sequences that are not valid UTF-8 are replaced with the replacement character U+FFFD.

So if you use Encoding.UTF8.GetBytes("Hello world!"), you can turn the result back into a string with Encoding.UTF8.GetString(byteArray).
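
If you are not sure whether some bytes are valid UTF-8, one option is a strict decoder that throws instead of silently substituting characters. The sketch below is only an illustration: the StrictUtf8 class and TryDecode method are hypothetical names, while the UTF8Encoding constructor with throwOnInvalidBytes and DecoderFallbackException are part of System.Text.

```csharp
using System;
using System.Text;

public static class StrictUtf8
{
    // No BOM on encode, and throw on invalid bytes when decoding.
    private static readonly UTF8Encoding Strict =
        new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);

    // Hypothetical helper: returns false instead of producing U+FFFD
    // when the input is not well-formed UTF-8.
    public static bool TryDecode(byte[] data, out string text)
    {
        try
        {
            text = Strict.GetString(data);
            return true;
        }
        catch (DecoderFallbackException)
        {
            text = string.Empty;
            return false;
        }
    }
}
```

For example, TryDecode succeeds on Encoding.UTF8.GetBytes("Hello world!") and returns false for new byte[] { 0xFF }. This only tells you whether the bytes are valid UTF-8 text; it does not make UTF-8 suitable for storing arbitrary binary data.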

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow