Question

How does C represent negative integers?

Is it by two's complement representation or by using the MSB (most significant bit)?

-1 in hexadecimal is ffffffff.

So please clarify this for me.

Was it helpful?

Solution

ISO C (C99 section 6.2.6.2/2 in this case but it carries forward to later iterations of the standard(a)) states that an implementation must choose one of three different representations for integral data types, two's complement, ones' complement or sign/magnitude (although it's incredibly likely that the two's complement implementations far outweigh the others).

In all those representations, positive numbers are identical, the only difference being the negative numbers.

To get the negative representation for a positive number, you:

  • invert all bits then add one for two's complement.
  • invert all bits for ones' complement.
  • invert just the sign bit for sign/magnitude.

You can see this in the table below:

number | two's complement    | ones' complement    | sign/magnitude
=======|=====================|=====================|====================
     5 | 0000 0000 0000 0101 | 0000 0000 0000 0101 | 0000 0000 0000 0101
    -5 | 1111 1111 1111 1011 | 1111 1111 1111 1010 | 1000 0000 0000 0101

Keep in mind that ISO doesn't mandate that all bits are used in the representation. They introduce the concept of a sign bit, value bits and padding bits. Now I've never actually seen an implementation with padding bits but, from the C99 rationale document, they have this explanation:

Suppose a machine uses a pair of 16-bit shorts (each with its own sign bit) to make up a 32-bit int and the sign bit of the lower short is ignored when used in this 32-bit int. Then, as a 32-bit signed int, there is a padding bit (in the middle of the 32 bits) that is ignored in determining the value of the 32-bit signed int. But, if this 32-bit item is treated as a 32-bit unsigned int, then that padding bit is visible to the user’s program. The C committee was told that there is a machine that works this way, and that is one reason that padding bits were added to C99.

I believe that machine they may have been referring to was the Datacraft 6024 (and it's successors from Harris Corp). In those machines, you had a 24-bit word used for the signed integer but, if you wanted the wider type, it strung two of them together as a 47-bit value with the sign bit of one of the words ignored:

+---------+-----------+--------+-----------+
| sign(1) | value(23) | pad(1) | value(23) |
+---------+-----------+--------+-----------+
\____________________/ \___________________/
      upper word            lower word

(a) Interestingly, given the scarcity of modern implementations that actually use the other two methods, there's been a push to have two's complement accepted as the one true method. This has gone quite a long way in the C++ standard (WG21 is the workgroup responsible for this) and is now apparently being considered for C as well (by WG14).

OTHER TIPS

C allows sign/magnitude, one's complement and two's complement representations of signed integers. Most typical hardware uses two's complement for integers and sign/magnitude for floating point (and yet another possibility -- a "bias" representation for the floating point exponent).

-1 in hexadecimal is ffffffff. So please clarify me in this regard.

In two's complement (by far the most commonly used representation), each bit except the most significant bit (MSB), from right to left (increasing order of magnitude) has a value 2n where n increases from zero by one. The MSB has the value -2n.

So for example in an 8bit twos-complement integer, the MSB has the place value -27 (-128), so the binary number: 1111 11112 is equal to -128 + 0111 11112 = -128 + 127 = -1

One useful feature of two's complement is that a processor's ALU only requires an adder block to perform subtraction, by forming the two's complement of the right-hand operand. For example 10 - 6 is equivalent to 10 + (-6); in 8bit binary (for simplicity of explanation) this looks like:

   0000 1010
  +1111 1010
   ---------
[1]0000 0100  = 4 (decimal)

Where the [1] is the discarded carry bit. Another example; 10 - 11 == 10 + (-11):

   0000 1010
  +1111 0101
   ---------
   1111 1111  = -1 (decimal)

Another feature of two's complement is that it has a single value representing zero, whereas sign-magnitude and one's complement each have two; +0 and -0.

For integral types it's usually two's complement (implementation specific). For floating point, there's a sign bit.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top