When a value is converted to a signed integer type but cannot be represented in that type, overflow occurs. It is common to see results as if a two's complement encoding were used and the low bits were stored (or, equivalently, as if the value were wrapped modulo an appropriate power of two). However, you cannot rely on this behavior: the C standard says that when signed integer overflow occurs, the behavior is undefined, so a compiler may act in surprising ways.
Consider this code, compiled for a target where short int is 16 bits:
#include <stdio.h>

void foo(int a, int b)
{
    if (a)
    {
        short int x = b;
        printf("x is %hd.\n", x);
    }
    else
    {
        printf("x is not assigned.\n");
    }
}
void bar(int a, int b)
{
    if (b == 65536)
        foo(a, b);
}
Observe that foo is a perfectly fine function on its own, provided b is within range of a short int. And bar is a perfectly fine function, as long as it is called only with a equal to zero or b not equal to 65536.
While the compiler is inlining foo into bar, it may deduce from the fact that b must be 65536 at this point that there would be an overflow in short int x = b;. This implies either that this path is never taken (i.e., a must be zero) or that any behavior is permitted (because the behavior upon overflow is undefined by the C standard). In either case, the compiler is free to omit this code path and generate code only for printf("x is not assigned.\n");. If you then executed code containing bar(1, 65536), the output would be “x is not assigned.”!
Compilers do make optimizations of this sort: from the observation that one code path has undefined behavior, the compiler may conclude that the path is never used. To an observer, it looks as if assigning a too-large value to a short int causes completely different code to be executed.