Question

I tried comparing arithmetic addition with a function using bitwise operations i wrote - to find that the latter was almost 10x slower. What could be the reason for such disparity in speed? Since i am adding to the same number in the loop, does the compiler rewrite it to something more optimal in the first case?

Using arithmetic operation:

int main()
{
    clock_t begin = clock();
    int x;
    int i = 1000000000;
    while(i--) {
    x = 1147483000 + i;
    }
    printf("%d\n", x);

    clock_t end = clock();
    double time_spent = (double)(end - begin) / CLOCKS_PER_SEC;

    printf("time spent = %f\n", time_spent);
    return 0;
}

Output:

1147483000
time spent = 3.520000

Using bitwise operators:

The line inside while loop was replaced with:

x = add(1147483000, i);

and here's the add function:

int add(int x, int y) {
    while(y != 0) {
        int carry = (x & y);
        x = x ^ y;
        y = carry << 1;
    }
    return x;
}

Output:

1147483000
time spent = 32.940000
Was it helpful?

Solution

Integer arithmetic is performed in hardware typically in a very small number of clock cycles.

You will not be able to get close to this performance in software. Your implementation using bitwise operations involves a function call and a loop. The bitwise operations that you perform typically cost similar numbers of clock cycles as arithmetic.

You are performing three bitwise operations per iteration. Frankly, I'm astonished that there is only a factor of 10 here.

I also wonder what your compiler settings are, specifically any optimizations. A good compiler could eliminate your while loop in the arithmetic version. For performance comparisons you should be comparing optimised code. It looks as if you might not be doing so.

It's difficult to know what you are trying to achieve here, but do not expect to beat the performance of hardware arithmetic units.

OTHER TIPS

You have replaced this:

x = 1147483000 + i;

with this:

while(y != 0) {
    int carry = (x & y);
    x = x ^ y;
    y = carry << 1;
}

Of course you get a huge slow-down! + of two integers is one assembly instruction. Your while loop executes many instructions, effectively simulating in software what the hardware does when it executes an addition.


To elaborate more, this is how a full adder looks like. With 32-bit addition, the ALU contains 32 of this unit cascaded. Each of those hardware elements have very very small delay. The delay of the wires is negligible. So if the software adds two 32-bit numbers together, it takes very very little time.

On the other hand, if you try to simulate the addition by hand, you make the CPU go to memory, fetch and decode some instructions 32 times which takes considerably longer time.

enter image description here

When you replace addition with a function call: calling the function is more time intensive than simple addition because function call is associated with stack operations.

In the function you are replacing addition with three bitwise operations - how fast they are compared to addition may be an issue- though not confirm about that without testing. Can you post the individual times for the three bitwise operations here?:

1.

//tic
while(i--) {
    int carry = (x & y);
}
//toc

2.

//tic
while(i--) {
    x = x ^ y;
}
//toc

3.

//tic
while(i--) {
    y = carry << 1;
}
//toc

But function call should be the main reason.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top