Question

In a C implementation that uses 2's complement integers, if a signed integer is negative, then its highest bit is 1, otherwise 0.

Let's take char and unsigned char: the range of a signed char is -128 to 127 and that of an unsigned char is 0 to 255, but in fact their hexadecimal representations both lie in the range 0x00 to 0xff. My question is: if both a char and an unsigned char are stored in memory as an 8-bit binary number, how does the computer itself know whether it is signed or unsigned?

char a = 0xff; printf("%d", a); // the result is -1.
unsigned char a = 0xff; printf("%d", a); // the result is 255.

In the example above, how does printf know whether the value 0xff is signed or unsigned? Does that depend only on the definition of a?


Solution

There are many related questions on this topic.

Your premise is not quite right: for a signed type the highest bit is not always 1 -- only if the value is negative. In fact, signed and unsigned are "types" attached to the exact same bit patterns, and how these bit patterns are interpreted when compared or promoted is defined by their respective types.

For example:

unsigned char u = 0xFF; // decimal 255
signed char s = 0xFF; // decimal -1

You can see that the stored bit patterns are the same (in both the highest bit is set), but they differ in their types.

The compiler uses a type system to know how to interpret values, and it is the task of the programmer to assign meaningful types to values. In the above example, I told the compiler that the first 0xFF should be interpreted as an unsigned value (see also the include file limits.h) with the maximum range:

u = 0x00; // decimal 0 (the minimum of an unsigned char is simply 0)
u = 0xFF; // decimal 255, UCHAR_MAX

and the second 0xFF as a signed value with the maximum range:

s = 0x00; // decimal 0
s = 0x7F; // decimal 127, SCHAR_MAX
s = 0x80; // decimal -128, SCHAR_MIN (note how 0x7F + 1 = 0x80, i.e. decimal 127 + 1 wraps to -128, an overflow)
s = 0xFF; // decimal -1
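
For reference, a minimal sketch of my own (assuming an 8-bit two's complement char) that prints these limits from the standard header limits.h:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    printf("CHAR_BIT  = %d\n", CHAR_BIT);  // bits in a char, usually 8
    printf("SCHAR_MIN = %d\n", SCHAR_MIN); // usually -128
    printf("SCHAR_MAX = %d\n", SCHAR_MAX); // usually 127
    printf("UCHAR_MAX = %d\n", UCHAR_MAX); // usually 255
    return 0;
}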

For the printf in your example, the %d tells it to expect a signed int value. According to the integer promotion rules of the C language, the smaller char type is either sign-extended (if it is a signed type) or zero-extended (if it is an unsigned type). To finish the above example:

printf("%d", u); // passes a int 0x000000FF, decimal 128, to the function
printf("%d", s); // passes a int 0xFFFFFFFF, decimal -1, to the function

There are more printf format specifiers; for example, %u might be interesting for you in this context.
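
For instance, one way to print the raw byte value of s regardless of its signedness (a small sketch of my own, not part of the original answer):

printf("%u\n", (unsigned)(unsigned char)s); // prints 255: the bit pattern viewed as unsigned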

OTHER TIPS

On a printf() call (and other occasions), the integer promotion rules apply.

The compiler converts the given value into an int. How it does so depends on the signedness of the char (which is, of course, known to the compiler): it fills the extra bits either with 0 bits (for an unsigned char) or with copies of the highest bit of the char (for a signed char).
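
A small sketch (assuming a 32-bit int and a signed 8-bit char) that makes this filling visible by printing the promoted values in hexadecimal:

#include <stdio.h>

int main(void)
{
    signed char   s = -1;   // bit pattern 0xFF
    unsigned char u = 0xFF; // same bit pattern

    // Cast the promoted int to unsigned so %x shows all of its bits.
    printf("%x\n", (unsigned)s); // ffffffff: filled with the highest bit (sign extension)
    printf("%x\n", (unsigned)u); // ff: filled with 0 bits (zero extension)
    return 0;
}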

Let's take char and unsigned char: the range of a signed char is -128 to 127 and that of an unsigned char is 0 to 255, but in fact their hexadecimal representations both lie in the range 0x00 to 0xff.

This statement is confusing and misleading. 0xFF is just another way to write 255. You could just as well have said 'In hexadecimal the range for a signed char is -0x80 to 0x7F and for an unsigned char is 0x00 to 0xFF.'

My question is: if both a char and an unsigned char are stored in memory as an 8-bit binary number, how does the computer itself know whether it is signed or unsigned?

The computer doesn't know. You tell it whether you want to interpret that memory as a signed number or unsigned number by typing the word unsigned.
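
One way to see this (a sketch of my own, not from the original answer) is to copy the very same byte into a signed and an unsigned variable and let the types do the interpreting:

#include <stdio.h>
#include <string.h>

int main(void)
{
    unsigned char raw = 0xFF; // one byte in memory
    signed char   s;
    unsigned char u;

    memcpy(&s, &raw, 1); // identical bit pattern in both variables
    memcpy(&u, &raw, 1);

    printf("%d %d\n", s, u); // typically prints: -1 255
    return 0;
}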

In the example above, how does printf know whether the value 0xff is signed or unsigned?

Leave printf out of it. Let's make a simpler example:

char a = 128; 

What happens? 128 is larger than the largest possible signed char (again, assuming 8-bit chars in two's complement, and that plain char is signed). So the value wraps around to the smallest possible value; a becomes -128.

char a = 129;

What happens? 129 is larger than the largest possible signed char by two. So it wraps around to the second smallest possible value, -127.

char a = 130;

This is three larger than the largest possible value, so it wraps around to the third smallest possible value, -126.

.... skip a few ...

char a = 255;

This is 128 larger than the largest possible value, so it wraps around to the 128th smallest possible value, which is -1.
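
A small sketch along these lines (assuming an 8-bit two's complement char that is signed; strictly speaking this conversion is implementation-defined, but it wraps like this on common platforms):

#include <stdio.h>

int main(void)
{
    int i;
    for (i = 127; i <= 130; i++) {
        char a = (char)i; // store a value at or above the largest signed char
        printf("char a = %d; // a is %d\n", i, a);
    }
    // Typical output:
    // char a = 127; // a is 127
    // char a = 128; // a is -128
    // char a = 129; // a is -127
    // char a = 130; // a is -126
    return 0;
}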

Got it?

OK, now that we understand that:

char a = 255;
unsigned char b = 255;

Now what happens when we say

int c = a;
int d = b;

? c is a signed integer. We have already determined that a wrapped around to -1, which is in the range of an int, so c becomes the int -1. b is the unsigned char 255, which is also in the range of an int, so d becomes the int 255.

The fact that the in-memory contents of a and b are the same is irrelevant. That memory is interpreted as a number based on the type that you assigned to a and b. In particular, the conversion of that bit pattern to an integer bit pattern is entirely dependent on the type.
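
Putting the whole example into one runnable program (a sketch assuming a two's complement machine on which plain char is signed):

#include <stdio.h>

int main(void)
{
    char          a = 255; // wraps around to -1 on such a machine
    unsigned char b = 255; // simply 255

    int c = a; // the signed char is converted to int: -1
    int d = b; // the unsigned char is converted to int: 255

    printf("c = %d, d = %d\n", c, d); // prints: c = -1, d = 255
    return 0;
}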

Licensed under: CC-BY-SA with attribution