Question

I need to optimize an expression of the form:

(a > b) || (a > c)

I tried several optimized forms one of which is as follows:

(a * 2) > (b + c)

Optimization is not from the compiler's point of view. I would like to reduce the two >s to one.

This is based on the assumption that 1 <= (a, b, c) <= 26

However, this works only for some cases. Is the optimization I am trying to do, really possible? If yes, a start would be really helpful.

Was it helpful?

Solution 3

The best I can come up with is this

char a, b, c;
std::cin >> a >> b >> c;

if (((b-a) | (c-a)) & 0x80) {
    // a > b || a > c
}

With gcc -O2 this generates only one conditional branch

40072e:       29 c8                   sub    %ecx,%eax
400730:       29 ca                   sub    %ecx,%edx
400732:       09 d0                   or     %edx,%eax
400734:       a8 80                   test   $0x80,%al
400736:       74 17                   je     40074f <main+0x3f>

This leverages the constraints of the input values, since the values cannot be greater than 26 then subtracting a from b will give you a negative value when a > b, in two's complement you know bit 7 will be set in that case - the same applies to c. I then OR both so that bit 7 indicates whether a > b || a > c, lastly we inspect bit 7 by AND with 0x80 and branch on that.

Update: Out of curiosity I timed 4 different ways of coding this. To generate test data I used a simple linear congruential pseudo-random number generator. I timed it in a loop for 100 million iterations. I assumed for simplicity that if the condition is true we want to add 5 to a counter, do nothing otherwise. I timed it using g++ (GCC) 4.6.3 20120306 (Red Hat 4.6.3-2) on an Intel Xeon X5570 @ 2.93GHz using -O2 optimization level.

Here's the code (comment out all but one of the conditional variants):

#include <iostream>
unsigned myrand() {
    static unsigned x = 1;
    return (x = x * 1664525 + 1013904223);
}

int main() {
    size_t count = 0;
    for(size_t i=0; i<100000000; ++i ) {
        int a = 1 + myrand() % 26;
        int b = 1 + myrand() % 26;
        int c = 1 + myrand() % 26;

        count += 5 & (((b-a) | (c-a)) >> 31);       // 0.635 sec
        //if (((b-a) | (c-a)) & 0x80) count += 5;     // 0.660 sec
        //if (a > std::max(b,c)) count += 5;          // 0.677 sec
        //if ( a > b || a > c) count += 5;            // 1.164 sec
    }
    std::cout << count << std::endl;
    return 0;
}

The fastest is a modification on the suggestion in my answer, where we use sign extension to generate a mask that is either 32 1s or 32 0s depending on whether the condition is true of false, and use that to mask the 5 being added so that it either adds 5 or 0. This variation has no branches. The times are in a comment on each line. The slowest was the original expression ( a > b || a > c).

OTHER TIPS

The answer is probably: you do not want to optimize that. Moreover, I doubt that there's any way to write this more efficiently. If you say that a, b and c are values between 1 and 26, you shouldn't be using integers (you don't need that precision) if you wanted to be optimal (in size) anyway.

If a > b, the expression a > c will not be executed anyway. So you have at maximum 2 (and at minimum 1) conditional operations, which is really not worth an optimization.

I'm quite doubtful this is even an optimisation in most cases.

 a > b || a > c 

will evaluate to:

 compare a b
 jump not greater
 compare a c
 jump not greater

where

 a * 2 > b + c

gives:

 shift a left 1 (in temp1)
 add b to c (in temp2)
 compare temp1 temp2
 jump if not greater

As always with performance, it's always much better to base your decision on actual performance measurements (preferably on a selection of processor architectures).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top