Question

For an integer that is never expected to take negative values, one could use either unsigned int or int. From a compiler perspective, or purely in terms of CPU cycles, is there any difference on x86_64?


Solution

It depends. It might go either way, depending on what you are doing with that int as well as on the properties of the underlying hardware.


An obvious example in unsigned int's favor is integer division. In C/C++, integer division is required to round towards zero, while the cheap machine-level replacements for division by a constant (arithmetic shifts, etc.) round towards negative infinity. (The x86 idiv instruction itself rounds towards zero, but compilers avoid it whenever the divisor is a known constant.) So, in order to satisfy the standard's requirements when using those replacements, the compiler is forced to adjust signed division results with additional machine instructions. For unsigned division this problem does not arise, which is why division by a constant generally works faster for unsigned types than for signed ones.
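The rounding mismatch is easy to check directly. Here is a minimal sketch (note that right-shifting a negative value is implementation-defined before C++20, though x86 compilers perform an arithmetic shift):

#include <cassert>

int main() {
    // C/C++ integer division truncates towards zero:
    assert(-7 / 2 == -3);
    // An arithmetic right shift rounds towards negative infinity,
    // so a bare shift cannot replace signed division by 2
    // (implementation-defined before C++20; arithmetic on x86):
    assert((-7 >> 1) == -4);
    // For unsigned operands the two agree, so a shift is always valid:
    assert(7u / 2 == (7u >> 1));
}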

For example, consider this simple expression

rand() / 2

The code generated for this expression by the MSVC compiler will generally look as follows

call        rand
cdq                     ; sign-extend eax into edx (edx = 0, or -1 for negative results)
sub         eax,edx     ; add 1 to negative values (subtracting -1)
sar         eax,1       ; arithmetic shift right: the actual division by 2

Note that instead of a single shift instruction (sar) we see a whole group of instructions here: our sar is preceded by two extra instructions (cdq and sub). These extra instructions exist only to "adjust" the division and force it to produce the "correct" (from the C language's point of view) result. The compiler does not know that your value will always be positive, so it has to generate them unconditionally. For values that are in fact always non-negative they never do anything useful, and simply waste CPU cycles.

Now take a look at the code for

(unsigned) rand() / 2

It is just

call        rand  
shr         eax,1       ; logical shift right: unsigned division by 2

In this case a single shift does the trick, giving us considerably faster code (for the division alone).


On the other hand, when you are mixing integer arithmetic and x87 FPU floating-point arithmetic, signed integer types might work faster, since the FPU instruction set contains instructions for loading/storing signed integer values (fild, fistp) but has none for unsigned values.

To illustrate this one can use the following simple function

double zero() { return rand(); }

The generated code will generally be very simple

call        rand 
mov         dword ptr [esp],eax     ; store the int to memory
fild        dword ptr [esp]         ; load it into the FPU as a signed integer

But if we change our function to

double zero() { return (unsigned) rand(); }

the generated code will change to

call        rand
test        eax,eax                 ; set flags according to the sign of the result
mov         dword ptr [esp],eax 
fild        dword ptr [esp]         ; fild always interprets the bits as signed
jge         zero+17h                ; sign bit clear: value is already correct, skip the fix-up
fadd        qword ptr [__real@41f0000000000000 (4020F8h)]  ; otherwise add 2^32

This code is noticeably larger because the FPU cannot load unsigned integers directly: fild interprets the stored bits as signed, so a value with its sign bit set is loaded as a large negative number, and the conditionally executed fadd then corrects it by adding 2^32 (that is what the __real@41f0000000000000 constant is).
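Expressed in C++, the generated code is doing roughly the following (a sketch of the machine code's logic, not anything the compiler literally emits):

// Hypothetical helper mirroring the fild/jge/fadd sequence above.
double load_unsigned(unsigned u) {
    // The cast mirrors fild reading the bits as signed; the signed
    // reinterpretation is implementation-defined before C++20, modular on x86.
    double d = (double)(int)u;
    if ((int)u < 0)            // sign bit set: the loaded value is off by 2^32
        d += 4294967296.0;     // add 2^32 back (the 0x41f0... constant)
    return d;
}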


There are other contexts and examples that can be used to demonstrate that it works either way. So, again, it all depends. But generally, all this will not matter in the big picture of your program's performance. I generally prefer to use unsigned types to represent unsigned quantities. In my code 99% of integer types are unsigned. But I do it for purely conceptual reasons, not for any performance gains.

OTHER TIPS

Signed types are inherently more optimizable in most cases because the compiler can ignore the possibility of overflow and simplify/rearrange arithmetic in whatever ways it sees fit. On the other hand, unsigned types are inherently safer because the result is always well-defined (even if not to what you naively think it should be).
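A classic illustration of what the signed-overflow freedom buys (a minimal sketch; optimizing compilers such as GCC and Clang typically fold the signed version to a constant):

// Signed overflow is undefined behaviour, so the compiler may assume
// x + 1 never wraps and reduce the whole function to 'return true':
bool always_true(int x) { return x + 1 > x; }

// Unsigned arithmetic wraps by definition: UINT_MAX + 1 == 0, so the
// result genuinely depends on x and a real comparison must be emitted:
bool sometimes_false(unsigned x) { return x + 1 > x; }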

The one case where unsigned types are better optimizable is division/remainder by a power of two. For unsigned types this translates directly to a bit shift and a bitwise AND. For signed types, unless the compiler can establish that the value is known to be non-negative, it must generate extra code to compensate for the rounding issue with negative numbers (according to C, -3/2 is -1, whereas an arithmetic shift would give -2).
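For example (the function names below are just for illustration), the unsigned versions typically compile down to a single instruction, while the signed one needs the compensation code:

unsigned udiv8(unsigned x) { return x / 8; }   // typically a single shr
unsigned urem8(unsigned x) { return x % 8; }   // typically a single and
// The signed version must add 7 to negative inputs before shifting,
// so that e.g. -3 / 8 yields 0 (truncation) rather than -1:
int sdiv8(int x) { return x / 8; }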

It will almost certainly make no difference. Occasionally the compiler can play games with the signedness of types to shave off a couple of cycles, but honestly the change is negligible overall.

For example, suppose you have an int x and want to write:

if(x >= 10 && x < 200) { /* ... */ }

You (or better yet, the compiler) can transform this a little to do one less comparison:

if((unsigned int)(x - 10) < 190) { /* ... */ }

This makes the assumption that int is represented in two's complement, so that if (x - 10) is less than 0 it becomes a huge value when viewed as an unsigned int. For example, on a typical x86 system, (unsigned int)-1 == 0xFFFFFFFF, which is clearly bigger than the 190 being tested.
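A quick brute-force check that the two forms agree (a sketch; x - 10 itself would only overflow, which is undefined behaviour, for x near INT_MIN, far outside this test range):

#include <cassert>

bool two_compares(int x) { return x >= 10 && x < 200; }
bool one_compare(int x)  { return (unsigned int)(x - 10) < 190; }

int main() {
    for (int x = -1000; x <= 1000; ++x)
        assert(two_compares(x) == one_compare(x));
}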

This is micro-optimization at best, and best left up to the compiler. Instead, write code that expresses what you mean, and if it is too slow, profile and decide where it really is necessary to get clever.

I don't imagine it would make much difference in terms of the CPU or the compiler. One possible case would be if it enables the compiler to know that the number will never be negative and optimize away code accordingly.

However, it IS useful to a human reading your code, so they know the domain of the variable in question.

From the ALU's point of view, adding (or whatever) signed or unsigned values doesn't make any difference, since they're both represented by a group of bits. 0100 + 1011 is always 1111, but you choose whether that means 4 + (-5) = -1 or 4 + 11 = 15.
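The same point in runnable form, scaled from 4 bits to 8 (a small sketch using fixed-width types; the signed reinterpretation in the cast is modular on x86):

#include <cstdint>
#include <cstdio>

int main() {
    std::uint8_t bits = 0xFB;             // one bit pattern...
    std::int8_t  s = (std::int8_t)bits;   // ...read as -5
    std::uint8_t u = bits;                // ...read as 251
    // The addition produces the same bit pattern (0xFF) either way;
    // only the interpretation of the result differs:
    std::printf("%d %u\n", (int)(std::int8_t)(4 + s), (unsigned)(std::uint8_t)(4 + u));
    // prints: -1 255
}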
So I agree with @Mark: you should choose the best data type to help others understand your code.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow