Question

Suppose I am in an ANSI-codepage-only environment.

Does this wide-char-to-char conversion:

char ansi_cstr[size_of_ansi_str];
WideCharToMultiByte(CP_ACP, 0, ansi_wstr.c_str(), -1, ansi_cstr, size_of_ansi_str, 0, 0);
std::string ansi_str = std::string(ansi_cstr);

equal the following:

std::string ansi_str = std::string(ansi_wstr.begin(), ansi_wstr.end());

and does this char-to-wide-char conversion:

wchar_t ansi_wcstr[size_of_ansi_str];
MultiByteToWideChar(CP_ACP, 0, ansi_str.c_str(), -1, ansi_wcstr, size_of_ansi_str);
std::wstring ansi_wstr = std::wstring(ansi_wcstr);

equal this:

std::wstring ansi_wstr = std::wstring(ansi_str.begin(), ansi_str.end());

Do both pairs behave the same in an ANSI-codepage-only environment?


Solution

There's no such thing as the ANSI code page environment. There are dozens.

Your two "shortcut" conversions are incorrect in all of them.

Your last method would correctly convert ASCII char to UTF-16 wchar_t, but it fails for the second half (0x80-0xFF) of most ANSI code pages. It works best with the Western European code page, where it gets only about 32 characters wrong. For instance, the Euro sign € will always be mis-converted.
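To make the Euro-sign case concrete, here is a minimal sketch that does not call the Windows API, so it compiles anywhere. The helper names `cp1252_to_unicode` and `naive_widen` are hypothetical. The table is the Windows-1252 layout for bytes 0x80-0x9F; mapping the five unassigned positions to the matching C1 control codes is an assumption made for this illustration.

```cpp
#include <string>

// Hypothetical helper: Unicode code point for one Windows-1252 byte.
// Only bytes 0x80-0x9F differ from the plain unsigned byte value.
inline char32_t cp1252_to_unicode(unsigned char b) {
    static const char32_t high[32] = {
        0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
        0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
        0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
        0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
    };
    if (b >= 0x80 && b <= 0x9F) {
        return high[b - 0x80];
    }
    return static_cast<char32_t>(b);
}

// The shortcut from the question: each char is converted to wchar_t
// by value, with no code page lookup.
inline std::wstring naive_widen(const std::string& s) {
    return std::wstring(s.begin(), s.end());
}
```

With these, `cp1252_to_unicode(0x80)` yields U+20AC (€), while `naive_widen` applied to the single byte 0x80 yields some other value. Which value exactly depends on whether `char` is signed on your compiler, so the shortcut can also go wrong beyond the 0x80-0x9F range.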

OTHER TIPS

WideCharToMultiByte(CP_ACP, 0, ansi_wstr.c_str(), -1, ansi_cstr, size_of_ansi_str, 0, 0);

IS NOT the same as

std::string ansi_str = std::string(ansi_wstr.begin(), ansi_wstr.end());

WideCharToMultiByte() performs a real conversion from UTF-16 to ANSI using the codepage that CP_ACP refers to on that PC (which can differ from PC to PC, based on user locale settings). std::string(begin, end) merely loops through the source container, type-casting each element to char, and does not perform any codepage conversion at all.
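A minimal sketch of what the iterator-range constructor does in the narrowing direction, again without the Windows API. `naive_narrow` and `first_byte` are hypothetical names used only for this illustration. Each wchar_t is converted to char by value, so on typical compilers only the low 8 bits survive: U+20AC becomes the byte 0xAC, not the 0x80 that Windows-1252 uses for the Euro sign.

```cpp
#include <string>

// Hypothetical name for the question's shortcut: every wchar_t element
// is implicitly converted to char, with no code page involved.
inline std::string naive_narrow(const std::wstring& w) {
    return std::string(w.begin(), w.end());
}

// First byte of a non-empty string, read as an unsigned value.
inline unsigned first_byte(const std::string& s) {
    return static_cast<unsigned char>(s.at(0));
}
```

For pure ASCII input such as `L"abc"` the result is the expected `"abc"`, which is why the shortcut can appear to work in casual testing.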

Likewise:

MultiByteToWideChar(CP_ACP, 0, ansi_str.c_str(), -1, ansi_wcstr, size_of_ansi_str);

IS NOT the same as

std::wstring ansi_wstr = std::wstring(ansi_str.begin(), ansi_str.end());

The reason is the same: MultiByteToWideChar() performs a real conversion from ANSI to UTF-16 using the CP_ACP codepage, whereas std::wstring(begin, end) simply type-casts the source elements to wchar_t without any conversion at all.

The type-casts would be equivalent to the API conversions ONLY if the source strings use ASCII characters in the 0x00-0x7F range. If they contain non-ASCII characters, all bets are off.
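If you want to keep the shortcut for data you believe is ASCII, one option is to verify that assumption before casting. The sketch below is illustrative only: `is_ascii` and `widen_ascii` are hypothetical helpers, and throwing on non-ASCII input is just one possible policy.

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>

// True when every byte is in the 0x00-0x7F range.
inline bool is_ascii(const std::string& s) {
    return std::all_of(s.begin(), s.end(), [](char c) {
        return static_cast<unsigned char>(c) <= 0x7F;
    });
}

// Widen by plain cast, but only for pure ASCII input, the one case
// where the cast and a real code page conversion agree.
inline std::wstring widen_ascii(const std::string& s) {
    if (!is_ascii(s)) {
        throw std::invalid_argument("non-ASCII input: use MultiByteToWideChar");
    }
    return std::wstring(s.begin(), s.end());
}
```

Anything that fails the check should go through MultiByteToWideChar() (or WideCharToMultiByte() in the other direction) instead.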

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow