The root problem with all of your approaches is that they require the `std::string` to be UTF-8 encoded, but `std::string str = "консоли";` is not UTF-8 encoded unless you save the .cpp file as UTF-8 and configure your compiler to treat the source as UTF-8. In most C++11 compilers, you can use the `u8` prefix to force a string literal to be UTF-8:

    std::string str = u8"консоли";
However, VS 2013 does not support that feature yet; in the C++11 feature support table, "Unicode string literals" is listed as No for VS 2010, VS 2012, and VS 2013.
Windows itself does not support UTF-8 in most API functions that take a `char*` as input (one exception is `MultiByteToWideChar()` when using `CP_UTF8`). When you call an `A` function, it calls the corresponding `W` function internally, converting any `char*` data to/from UTF-16 using Windows' default ANSI codepage (`CP_ACP`). So you get garbled results when you pass non-`CP_ACP` data to functions that expect `CP_ACP` data. As such, `MessageBoxA()` will work correctly only if your .cpp file and compiler use the same codepage as `CP_ACP`, so that the unprefixed `char*` data matches what `MessageBoxA()` is expecting.
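To sidestep `CP_ACP` entirely, convert your UTF-8 data to UTF-16 yourself and call the `W` function directly. A minimal Windows-only sketch (the `Utf8ToUtf16` helper name is mine, not part of any API):

```cpp
#include <windows.h>
#include <string>

// Convert UTF-8 to UTF-16 explicitly via CP_UTF8, then call the
// W API so no CP_ACP conversion is ever involved.
std::wstring Utf8ToUtf16(const std::string &utf8)
{
    if (utf8.empty()) return std::wstring();
    // First call asks for the required size in wchar_t units.
    int len = MultiByteToWideChar(CP_UTF8, 0, utf8.c_str(),
                                  (int)utf8.length(), NULL, 0);
    std::wstring utf16(len, L'\0');
    MultiByteToWideChar(CP_UTF8, 0, utf8.c_str(),
                        (int)utf8.length(), &utf16[0], len);
    return utf16;
}

int main()
{
    // "консоли" as explicit UTF-8 bytes
    std::string str = "\xD0\xBA\xD0\xBE\xD0\xBD\xD1\x81\xD0\xBE\xD0\xBB\xD0\xB8";
    MessageBoxW(NULL, Utf8ToUtf16(str).c_str(), L"Test", MB_OK);
    return 0;
}
```

Note that `utf8.length()` is passed explicitly, so the result contains no trailing null terminator to worry about.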
I don't know why `AUTF8ToUTF16()` is crashing; it is probably a bug in your compiler's STL implementation when it processes malformed data.
`BUTF8ToUTF16()` is not handling the case described in its own documentation: "If the input byte/char sequences are invalid, returns U+FFFD for UTF encodings." Also, your implementation is not optimal: pass `length()` instead of `-1` for the input length to avoid dealing with null-terminator issues.
`CUTF8ToUTF16()` is not doing any error handling or validation. However, converting invalid input to question marks or U+FFFD is very common in most libraries.