Question

I have one strange problem. I have following piece of code:

template<class index, class policy>
inline int CBase<index, policy>::func(const A& test_in, int* srcPtr, int* dstPtr)
{
    int width = test_in.width();
    int height = test_in.height();

    double d = 0.0; //here is the problem
    for(int y = 0; y < height; y++)
    {

        //Pointer initializations

        //multiplication involving y
        //ex: int z = someBigNumber*y + someOtherBigNumber;
        for(int x = 0; x < width; x++)
        {
            //multiplication involving x
            //ex: int z = someBigNumber*x + someOtherBigNumber;
            if(someCondition)
            {
                // floating point calculations
            }
            *dstPtr++ = array[*srcPtr++];
        }
    }
    return 0; // the excerpt omits the real return value
}

The inner loop gets executed nearly 200,000 times, and the entire function takes 100 ms to complete (profiled using AQTimer).

I found an unused variable, double d = 0.0;, outside the outer loop and removed it. After this change, the method suddenly takes 500 ms for the same number of executions (5 times slower).

This behavior is reproducible on different machines with different processor types (Core 2, dual-core processors).

I am using the VC6 compiler with optimization level /O2. The following are the other compiler options used:

 -MD -O2 -Z7 -GR -GX -G5 -X -GF -EHa

I suspected the compiler optimizations and removed /O2. After that, the function went back to normal and takes 100 ms, just like the old code.
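To check whether the effect really comes from this function's optimized code generation, rather than from something elsewhere in the build, MSVC (including VC6) lets you switch the optimizer off for a single stretch of code with #pragma optimize. A minimal sketch, with the body elided:

    // Turn optimization off for just this function; the rest of the
    // translation unit still builds with /O2.
    #pragma optimize("", off)
    template<class index, class policy>
    inline int CBase<index, policy>::func(const A& test_in, int* srcPtr, int* dstPtr)
    {
        // ... body as above ...
        return 0;
    }
    #pragma optimize("", on)

If the timing returns to 100 ms with the pragma in place, the regression is localized to this function's optimized code.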

Could anyone shed some light on this strange behavior?

Why should a compiler optimization slow down performance when I remove an unused variable?

Note: The assembly code (before and after the change) looked the same.

No correct solution

OTHER TIPS

If the assembly code looks the same before and after the change, the difference is somehow connected to how you time the function.
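One way to take the profiler out of the equation is to time the call directly with the Win32 high-resolution counter, which was already available in the VC6 era. A minimal sketch; funcUnderTest is a hypothetical stand-in for the real call:

    #include <windows.h>
    #include <stdio.h>

    void funcUnderTest(); // stand-in for the real CBase<...>::func call

    void timeIt()
    {
        LARGE_INTEGER freq, start, end;
        QueryPerformanceFrequency(&freq);

        funcUnderTest(); // warm-up run (caches, page faults)

        QueryPerformanceCounter(&start);
        for (int i = 0; i < 100; ++i) // average over many iterations
            funcUnderTest();
        QueryPerformanceCounter(&end);

        double ms = (double)(end.QuadPart - start.QuadPart) * 1000.0
                    / (double)freq.QuadPart / 100.0;
        printf("avg: %.3f ms per call\n", ms);
    }

If this hand-rolled measurement agrees with AQTimer, the timing itself is not the problem.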

VC6 is buggy as hell. It is known to generate incorrect code in several cases, and its optimizer isn't all that advanced either. The compiler is over a decade old, and hasn't even been supported for many years.

So really, the answer is "you're using a buggy compiler. Expect buggy behavior, especially when optimizations are enabled."

I don't suppose upgrading to a modern compiler (or simply testing the code on one) is an option?

Obviously, the generated assembly cannot be the same, or there would be no performance difference.

The only question is where the difference lies. With a buggy compiler, it may well be some completely unrelated part of the code that suddenly gets compiled differently and breaks. More likely, though, the assembly generated for this function is not actually the same, and the differences are subtle enough that you didn't notice them.
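One concrete way to check is to disassemble both builds and diff them. dumpbin ships with Visual Studio (including VC6); the file names below are placeholders:

    rem Compile both versions to object files, then compare the disassembly.
    cl /nologo /MD /O2 /c with_d.cpp
    cl /nologo /MD /O2 /c without_d.cpp
    dumpbin /disasm with_d.obj    > with_d.asm
    dumpbin /disasm without_d.obj > without_d.asm
    fc with_d.asm without_d.asm

Even a one-instruction difference in code alignment or register allocation will show up clearly in the diff.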

Declare width and height as const ints (ideally const unsigned, since heights and widths are never negative):

const int width = test_in.width();
const int height = test_in.height();

This helps the compiler optimize. With the values const, it can embed them in the code or keep them in registers, knowing that they won't change. It also relieves the compiler of having to assume the variables might be modified somewhere inside the loops.

I suggest printing out the assembly code of the version with the unused double and the version without it. This will give you an insight into the compiler's thought process.
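With MSVC you can ask the compiler itself for an annotated listing instead of disassembling the binary: the /FAcs switch (also present in VC6) writes a .cod file with source, assembly, and machine code interleaved. The file name is a placeholder:

    rem Writes func.cod (source + assembly + machine code) next to func.obj.
    cl /nologo /MD /O2 /FAcs /c func.cpp

Comparing the two .cod files line by line is usually easier than reading raw disassembly, because the source lines anchor each block of instructions.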

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow