You seem to be assuming that Windows internally works in the specified codepage. That's not true. Windows internally works in Unicode (UTF-16). For legacy software that uses char
instead of wchar_t
, input and output are translated into the specified codepage.
I think what this means is the Windows console translates the input key strokes into characters based on the default code page
This is not correct. The mapping of key strokes to (Unicode) characters is defined by the keyboard layout. This is totally independent of the code page. E.g you could use a Chinese keyboard layout on a system using a Cyrillic code page.
- Not only makes it totally sense to use
wchar_t
, it is the recommended way. - Yes, there is an advantage: your program can process all characters supported by Windows. If you use char, you can't handle any characters that are not in the current code page.
- They are not converted - they stay what they are, namely UTF-16 characters.
Unfortunately, the command prompt itself is an 'ANSI' application, so it suffers from all of the limitations of 'ANSI', and this affects your application if you use it from the command prompt. However, a console application can be used in other ways, without a command prompt window, and then it can support Unicode fully.