Question

I need to write a C# string to a preallocated unmanaged buffer, encoded as UTF-8. Before answering, please read the following requirements:

  • No new allocations (so please, don't direct me to answers involving creating byte arrays or other instantiations)
  • No transitions to unmanaged code (no pinvoke/calli)

Currently, I'm using OpCodes.Cpblk to copy raw strings from C# to unmanaged buffers as 16-bit characters. This gives me roughly the same performance as an unmanaged memcpy on x64, and I really need the throughput to stay close to that.

I am considering fixing the string as a char* and iterating over it, but implementing an encoder without jump tables would be both cumbersome and less than optimal when it comes to performance.

Solution

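Use the unsafe overload

public override unsafe int GetBytes(char* chars, int charCount, byte* bytes, int byteCount)

of the UTF8Encoding class. (GetChars, which takes the same pointers in the opposite order, goes the other way and decodes bytes into chars.) Pin the string with fixed to get a char*, then pass it together with the string's length, a pointer to the destination buffer, and the buffer's capacity in bytes. The method writes the UTF-8 encoded bytes into the buffer and returns the number of bytes written; it throws an ArgumentException if the buffer is too small. No byte arrays are allocated, but it does require unsafe code.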
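As a minimal sketch of encoding a string straight into a raw buffer with the pointer-based GetBytes overload: the class and method names below (Utf8Writer, WriteUtf8, Example) are made up for illustration, and the stackalloc buffer merely stands in for the preallocated unmanaged buffer from the question. It assumes the project is compiled with unsafe code enabled.

```csharp
using System;
using System.Text;

static class Utf8Writer
{
    // Hypothetical helper: encodes `value` as UTF-8 into `destination`,
    // which must have room for `capacity` bytes.
    // Returns the number of bytes actually written.
    public static unsafe int WriteUtf8(string value, byte* destination, int capacity)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));
        if (value.Length == 0) return 0; // nothing to encode

        // Pin the string so the GC cannot move it while we hold a char*.
        fixed (char* source = value)
        {
            // Throws ArgumentException if `capacity` is too small.
            return Encoding.UTF8.GetBytes(source, value.Length, destination, capacity);
        }
    }

    // Hypothetical usage: a stack buffer stands in for the unmanaged buffer.
    public static unsafe int Example()
    {
        const int capacity = 64;
        byte* buffer = stackalloc byte[capacity];
        return WriteUtf8("héllo", buffer, capacity); // 'é' takes two bytes
    }
}
```

A safe upper bound for the destination size is Encoding.UTF8.GetMaxByteCount(value.Length), since each UTF-16 code unit expands to at most three UTF-8 bytes. To my knowledge the encoder does not allocate for well-formed input, but strings containing unpaired surrogates go through the replacement fallback, so it may be worth profiling that case if it matters.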

Licensed under: CC-BY-SA with attribution