Is wchar_t signed or unsigned?
Question
In this link, unsigned wchar_t is typedef'ed as WCHAR. But I can't find this kind of typedef in my SDK's winnt.h or in MinGW's winnt.h.
Is wchar_t signed or unsigned?
I am using the Windows API from C.
Solution
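The signedness of wchar_t is not fixed by the language standard; it is left to the implementation. The C++ standard only says (3.9.1/5):
Type wchar_t shall have the same size, signedness, and alignment requirements (3.11) as one of the other integral types, called its underlying type.
In C the situation is the same: wchar_t is a typedef for an implementation-defined integer type.
(By contrast, the types char16_t and char32_t are expressly unsigned.)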
OTHER TIPS
Be aware that the size of the type also varies by platform.
Windows uses UTF-16, and its wchar_t is 2 bytes. Linux uses a 4-byte wchar_t.
I just tested on several platforms, with no optimisation.
1) MinGW (32-bit) + gcc 3.4.4:
---- snip ----
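#include <stdio.h>
#include <wchar.h>
/* 0xFEFF has the top bit of a 16-bit value set, so signedness shows when it is widened. */
const wchar_t BOM = 0xFEFF;
int main(void)
{
    int c = BOM;  /* zero- or sign-extended, depending on wchar_t's signedness */
    /* unsigned 16-bit: 0xFEFF + 0x1000 = 0x10EFF; signed 16-bit: -257 + 0x1000 = 0xEFF */
    printf("0x%08X\n", (unsigned)(c + 0x1000));
    return 0;
}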
---- snip ----
It prints 0x00010EFF, so wchar_t is unsigned.
The corresponding assembly code says movzwl _BOM, %eax: not movswl (sign-extend), but movzwl (zero-extend).
2) FreeBSD 11.2 (64-bit) + clang 6.0.0:
---- snip ----
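#include <stdio.h>
#include <wchar.h>
/* 0xFFFE0000 has the top bit of a 32-bit value set. */
const wchar_t INVERTED_BOM = 0xFFFE0000;
int main(void)
{
    long long c = INVERTED_BOM;  /* sign- or zero-extended to 64 bits, depending on wchar_t */
    printf("0x%016llX\n", (unsigned long long)(c + 0x10000000LL));
    return 0;
}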
---- snip ----
It prints 0x000000000FFE0000 (that is, -131072 + 0x10000000), so wchar_t is signed.
The corresponding assembly code says movq $-131072, -16(%rbp): the 32-bit 0xFFFE0000 is sign-extended to the 64-bit signed value -131072.
3) Same code as 2), on Red Hat (version unknown) + gcc 4.4.7: it again prints 0x000000000FFE0000, so wchar_t is signed.
What I tested is neither printf's implementation nor WinAPI's WCHAR definition, but the behavior of the compiler's own wchar_t type (no header file says anything explicit about its signedness) and of the compiler's C-to-assembly code generation.
Note that the compilers in 1) and 3) come from the same vendor, namely the GNU Project, yet they disagree. The answer definitely depends on the platform. (Would somebody test on Visual C++?)