Question

Can anyone answer why this statement:

unsigned int index; 
unsigned int i;  
unsigned int n;  
while (condition) {
    i = index / n / n;
}

Is faster than:

unsigned int index; 
unsigned int i;  
unsigned int n;  
unsigned int n2 = n * n;  

while (condition) {
    i = index / n2;
}

Very strange, here is the video of me demonstrating it: [removed]

The statement gets ran 400,000 times or and ends up being 0.15 seconds slower. I am cleaning my .o files and remaking with -O3 optimization each time I run the test. Using gcc (GCC) 4.4.7 20120313 on Red Hat 4.4.7-4

Update:

As suggested I converted it to assembly to check it out. My limited knowledge of assembly doesn't reveal answers though, in fact it seems the faster solution should take longer because more operations

 128:utils.cpp     ****         i = index / farm_size / farm_size;
 1221                   .loc 8 129 0
 1222 046a 8B459C           movl    -100(%rbp), %eax
 1223 046d BA000000         movl    $0, %edx
 1223      00
 1224 0472 F7B57CFF         divl    -132(%rbp)
 1224      FFFF
GAS LISTING /tmp/cc4vphk8.s             page 44
 1225 0478 BA000000         movl    $0, %edx
 1225      00
 1226 047d F7B57CFF         divl    -132(%rbp)
 1226      FFFF
 1227 0483 8945B8           movl    %eax, -72(%rbp)
 129:utils.cpp     ****         j = (index / farm_size) % farm_size;

.

 128:utils.cpp     ****         i = index / farm_area;
 1221                   .loc 8 129 0
 1222 046a 8B459C           movl    -100(%rbp), %eax
 1223 046d BA000000         movl    $0, %edx
 1223      00
 1224 0472 F775D8           divl    -40(%rbp)
 1225 0475 8945B8           movl    %eax, -72(%rbp)
GAS LISTING /tmp/ccNU8jKx.s             page 44
 129:utils.cpp     ****         j = (index / farm_size) % farm_size;

here is a side by side comparsion:

http://i.imgur.com/mB4OeFM.png

Was it helpful?

Solution

There is important information in your video that you did not put in the code.

The double-division code which runs in 1.15s is shown here

enter image description here

The single-division code which runs in 1.34s is shown here: enter image description here

One important difference which is not evident in the posted question is the set of variables that are in your loop.

In the faster code, you have i, j, k, index, farm_size.

In the slower code, you have i, j, k, index, farm_size, and farm_area.

So even though you're doing one less division, you're moving around more variables which is what is costing you the extra time.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top