Question

I'm working on a PInvoke wrapper for a library that does not support Unicode strings, but does support multi-byte ANSI strings. While investigating FxCop reports on the library, I noticed that the string marshaling being used had some interesting side effects. The PInvoke method was using "best fit" mapping to create a single-byte ANSI string. For illustration, this is what one method looked like:

[DllImport("thedll.dll", CharSet=CharSet.Ansi)]
public static extern int CreateNewResource(string resourceName);

When this function is called with a string that contains non-ASCII characters, Windows substitutes a "close" character for each one it cannot represent; in practice the result usually ends up looking like "???". If we pretend that 'a' is a non-ASCII character, then passing "cat" as a parameter would create a resource named "c?t".

If I follow the guidelines in the FxCop rule, I end up with something like this:

[DllImport("thedll.dll", CharSet=CharSet.Ansi, BestFitMapping = false, ThrowOnUnmappableChar = true)]
public static extern int CreateNewResource([MarshalAs(UnmanagedType.LPStr)] string resourceName);

This introduces a change in behavior: now an exception is thrown when a character cannot be mapped. This concerns me because it is a breaking change, so I'd like to marshal the strings as multi-byte ANSI instead, but I cannot see a way to do so. UnmanagedType.LPStr is specified to be a single-byte ANSI string, LPTStr will be Unicode or ANSI depending on the system, and LPWStr is not what the library expects.

How would I tell PInvoke to marshal the string as a multi-byte string? I see there's a WideCharToMultiByte() API function. Could I change the signature to expect an IntPtr to a string I create in unmanaged memory? It seems like this would still have many of the problems of the current implementation (it might still have to drop or substitute characters), so I'm not sure it is an improvement. Is there another method of marshaling that I'm missing?

Solution

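On Windows, "ANSI" does not imply single-byte: ANSI strings are encoded according to the ANSI code page currently enabled on the system, and that code page can be a multi-byte one. P/Invoke's ANSI marshaling converts through that code page, which is the same conversion WideCharToMultiByte performs when given CP_ACP.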

Maybe what you're after is conversion to UTF-8. Although WideCharToMultiByte supports this, I don't think P/Invoke does, since it's not possible to adopt UTF-8 as the system-wide ANSI code page. At this point you'd be looking at passing the string as an IntPtr instead, although if you're doing that, you may as well use the managed Encoding class to do the conversion, rather than WideCharToMultiByte.
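A minimal sketch of the IntPtr approach described above, using the managed Encoding class for the conversion. The helper name ToNativeString and the IntPtr overload of CreateNewResource are assumptions made for illustration; whether the library actually accepts UTF-8 (or any other particular encoding) has to be checked against its documentation.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

static class NativeStrings
{
    // Hypothetical signature: the string is passed as a raw pointer
    // instead of being converted by the P/Invoke marshaler.
    [DllImport("thedll.dll")]
    public static extern int CreateNewResource(IntPtr resourceName);

    // Copies s into unmanaged memory as a null-terminated byte string in the
    // given encoding. The caller must release it with Marshal.FreeHGlobal.
    // A single zero byte is a valid terminator for UTF-8 and ANSI code pages,
    // but not for UTF-16 or UTF-32.
    public static IntPtr ToNativeString(string s, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(s);
        IntPtr buffer = Marshal.AllocHGlobal(bytes.Length + 1);
        Marshal.Copy(bytes, 0, buffer, bytes.Length);
        Marshal.WriteByte(buffer, bytes.Length, 0); // terminator
        return buffer;
    }
}
```

A call site would allocate the buffer with ToNativeString(name, Encoding.UTF8), pass it to CreateNewResource inside a try block, and call Marshal.FreeHGlobal on it in the finally block.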

OTHER TIPS

Here is the best way I've found to accomplish this: instead of marshalling as a string, marshal as a byte[]. This puts the responsibility on the caller of the P/Invoke function to convert the string to a byte array in the most appropriate fashion, most likely by using one of the System.Text.Encoding classes.
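As a rough sketch of the byte[] approach, assuming the native function expects a null-terminated byte string: the byte[] overload of CreateNewResource, the ResourceApi class, and the ToNullTerminated helper are hypothetical names introduced here, not part of the original library wrapper.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

static class ResourceApi
{
    // Hypothetical signature: a byte[] is pinned for the duration of the call
    // and passed to native code as a pointer to its first element.
    [DllImport("thedll.dll")]
    private static extern int CreateNewResource(byte[] resourceName);

    // Encodes s with the caller's encoding and appends one zero byte
    // as the string terminator (Array.Resize zero-fills the new element).
    public static byte[] ToNullTerminated(string s, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(s);
        Array.Resize(ref bytes, bytes.Length + 1);
        return bytes;
    }

    // The caller decides which encoding the library should receive.
    public static int CreateNewResource(string resourceName, Encoding encoding)
    {
        return CreateNewResource(ToNullTerminated(resourceName, encoding));
    }
}
```

Because the caller supplies the Encoding, the caller also controls what happens to unmappable characters: an encoding obtained through the Encoding.GetEncoding overload that takes an EncoderFallback can either substitute a replacement or throw.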

If you end up having to call WideCharToMultiByte manually, I would get rid of the p/invoke and manually marshal this using WideCharToMultiByte in a C++/CLI wrapper function. Managed C++ is much better at these interop scenarios than C# is.

Though, if this is the only p/invoke you have, it's probably not worth it.

Licensed under: CC-BY-SA with attribution