Question

Possible Duplicate:
What happens if I assign a negative value to an unsigned variable?

I'm new to C++ and I want to understand how unsigned types work. I know that an unsigned int can take values from 0 to 4294967295. But when I initialize an unsigned int as follows:

unsigned int x = -10;
cout << x;

The output is 4294967286, which is 2^32 - 10. So I want to understand what is happening in memory. What operations are performed during this calculation? Thanks for your answers.


Solution

You're encountering wrap-around behavior.

Unsigned types are cyclic (signed types may appear to wrap on some platforms, but signed overflow is undefined behavior, so you shouldn't rely on it). That is to say, one less than the minimum possible value is the maximum possible value. You can demonstrate this yourself with the following snippet:

#include <iostream>
using namespace std;

int main()
{
    unsigned int x = 5;
    // Decrementing past zero: unsigned arithmetic wraps around to the maximum value.
    for (int i = 0; i < 10; ++i) cout << x-- << endl;
    return 0;
}

You'll notice that after reaching zero, the value of x jumps to 2^32-1, the maximum representable value. Subtracting further acts as expected.

When you subtract 1 from unsigned 0, the bit pattern changes in the following way:

0000 0000 0000 0000 0000 0000 0000 0000 // before (0)
1111 1111 1111 1111 1111 1111 1111 1111 // after  (2^32 - 1)

With unsigned numbers, negative numbers are treated like positive numbers subtracted from zero. So (unsigned int) -10 will equal ((unsigned int) 0) - ((unsigned int) 10).
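
A minimal sketch of that equivalence (the literal 4294967286u assumes the 32-bit unsigned int from your output):

#include <iostream>
using namespace std;

int main()
{
    unsigned int a = -10;      // int -10 converted to unsigned
    unsigned int b = 0u - 10u; // zero minus ten, entirely in unsigned arithmetic
    cout << a << " " << b << " " << (a == 4294967286u) << endl; // prints 4294967286 4294967286 1
    return 0;
}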

I like to think of an unsigned int as the lowest 32 bits of some arbitrary higher-precision value. Like this:

v imaginary high order bit
1 0000 0000 0000 0000 0000 0000 0000 0000 // before (2^32)
0 1111 1111 1111 1111 1111 1111 1111 1111 // after  (2^32 - 1)

The behavior of an unsigned int in these overflow cases is exactly the same as the behavior of the low 8 bits of an unsigned int when you subtract 1 from 256. It's easier to picture this with an unsigned char (1 byte), because the values 0 and 256 are equal when cast to unsigned char: the limited precision simply discards the extra bits.

0 0000 0000 0000 0000 0000 0001 0000 0000 // before (256)
0 0000 0000 0000 0000 0000 0000 1111 1111 // after  (255)
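
You can check this with a quick sketch (assuming the usual 8-bit unsigned char):

#include <iostream>
using namespace std;

int main()
{
    unsigned char c = 0;
    c = c - 1;                           // wraps around modulo 256
    cout << static_cast<int>(c) << endl; // prints 255
    cout << static_cast<int>(static_cast<unsigned char>(256)) << endl; // prints 0: 256 reduces to 0 modulo 256
    return 0;
}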

As others have pointed out, this is called modular arithmetic. Using higher-precision values to visualize the wrap-around transitions works because the high-order bits get masked off afterwards: it doesn't matter what they were, they're simply discarded. Unsigned int values are defined modulo 2^32, so any multiple of 2^32 is congruent to zero, and that's why I can get away with pretending there's an extra bit on the end.

Modulus operations have their own dedicated operator, %, in case you need to compute them for moduli other than 2^32, as in this statement:

int forty_mod_twelve = 40 % 12;
// value is 4: 4 + n * 12 == 40 for some whole number n

Modulus operations on powers of two (like 2^32) simplify directly to masking off high order bits, and if you take a 64 bit integer and compute it modulo 2^32, the value will be exactly the same as if you had converted it to an unsigned int.

01011010 01011100 10000001 00001101 11111111 11111111 11111111 11111111 // before
00000000 00000000 00000000 00000000 11111111 11111111 11111111 11111111 // after

Programmers like to use this property to speed up programs, because it's easy to chop off some number of bits, but performing a general modulus operation is much more expensive (it costs about as much as a division).
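
A minimal sketch of that equivalence, using the 64-bit value from the diagram above (the variable names are just for illustration, and a 32-bit unsigned int is assumed):

#include <iostream>
using namespace std;

int main()
{
    unsigned long long big = 0x5A5C810DFFFFFFFFull;             // the 64-bit value from the diagram
    unsigned long long masked = big & 0xFFFFFFFFull;            // keep only the low 32 bits
    unsigned long long mod    = big % 0x100000000ull;           // same thing, computed as big mod 2^32
    unsigned int truncated    = static_cast<unsigned int>(big); // same thing again, by conversion
    cout << masked << " " << mod << " " << truncated << endl;   // all three print 4294967295
    return 0;
}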

Does that make sense?

OTHER TIPS

This involves the standard integral conversions. Here's the applicable rule. We start with the type of the literal 10:

2.14.2 Integer literals [lex.icon]

An integer literal is a sequence of digits that has no period or exponent part. An integer literal may have a prefix that specifies its base and a suffix that specifies its type. The lexically first digit of the sequence of digits is the most significant. A decimal integer literal (base ten) begins with a digit other than 0 and consists of a sequence of decimal digits. An octal integer literal (base eight) begins with the digit 0 and consists of a sequence of octal digits. A hexadecimal integer literal (base sixteen) begins with 0x or 0X and consists of a sequence of hexadecimal digits, which include the decimal digits and the letters a through f and A through F with decimal values ten through fifteen. [ Example: the number twelve can be written 12, 014, or 0XC. — end example ]

The type of an integer literal is the first of the corresponding list in Table 6 in which its value can be represented.

A table follows; the first type listed is int, and the value 10 fits in an int, so the literal's type is int.

The unary minus operator is then applied, which doesn't change the type (the result is -10, still an int). Then the following rule is applied:

4.7 Integral conversions [conv.integral]

A prvalue of an integer type can be converted to a prvalue of another integer type. A prvalue of an unscoped enumeration type can be converted to a prvalue of an integer type.

If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [ Note: In a two's complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). — end note ]
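
A small sketch of the note about the bit pattern (a two's complement representation is assumed, and memcmp just compares the raw bytes):

#include <cstring>
#include <iostream>
using namespace std;

int main()
{
    int s = -10;
    unsigned int u = s; // least unsigned value congruent to -10 modulo 2^32: 4294967286
    // On a two's complement machine the conversion leaves the bit pattern untouched.
    cout << u << " " << (memcmp(&s, &u, sizeof s) == 0) << endl; // prints 4294967286 1
    return 0;
}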

Instead of printing the value the way you are, print it in hexadecimal format (sorry, I forget how to do that with cout but I know it's possible). You'll see that the representation is the same for -10 stored in a signed int and for the unsigned value 4294967286.
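
For example, a minimal sketch with the std::hex manipulator (a 32-bit int is assumed):

#include <iostream>
using namespace std;

int main()
{
    int s = -10;
    unsigned int u = -10;
    cout << hex << s << endl; // prints fffffff6
    cout << hex << u << endl; // prints fffffff6 as well: same bit pattern
    return 0;
}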

From your output, an int is 32 bits on your platform (this is not always the case). In a signed integer the most significant bit acts as the sign bit (in two's complement it carries a weight of -2^31), while in an unsigned integer that same bit simply contributes +2^31 to the value.
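
Here's a small sketch of that difference (converting 0x80000000 to int is implementation-defined before C++20, but on the usual two's complement platforms it behaves as shown):

#include <iostream>
using namespace std;

int main()
{
    unsigned int u = 0x80000000u; // only the most significant bit set
    int s = static_cast<int>(u);  // implementation-defined before C++20; two's complement assumed
    cout << u << " " << s << endl; // prints 2147483648 -2147483648
    return 0;
}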

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow