Performance of branch prediction in a loop

https://stackoverflow.com/questions/12251160

30-06-2021

Question

Would there be any noticeable speed difference between these two snippets of code? Naively, I think the second snippet would be faster because branch instructions are encountered a lot less, but on the other hand the branch predictor should solve this problem. Or will it have a noticeable overhead despite the predictable pattern? Assume that no conditional move instruction is used.

Snippet 1:

for (int i = 0; i < 100; i++) {
    if (a == 3)
        output[i] = 1;
    else
        output[i] = 0;
}

Snippet 2:

if (a == 3) {
    for (int i = 0; i < 100; i++)
        output[i] = 1;
} else {
    for (int i = 0; i < 100; i++)
        output[i] = 0;
}

I'm not intending to optimise these cases myself, but I would like to know more about the overhead of branches even with a predictable pattern.

Solution

Since the variable a does not change once the loop is entered, there shouldn't be much difference between the two snippets.

Personally, I would prefer the first snippet, unless the branch predictor fails to predict the branch, which is very unlikely given that a never changes inside the loop.

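Moreover, the compiler may apply an optimization called loop unswitching, which moves a loop-invariant condition out of the loop and duplicates the loop for each outcome of that condition. When it does, both snippets compile to exactly the same machine instructions.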
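To make the transformation concrete, here is a minimal sketch that wraps the two snippets from the question in functions. The names fill_branch_inside and fill_unswitched and the parameter n are made up for illustration; the second function is roughly what loop unswitching would turn the first one into, not the literal output of any particular compiler.

```c
#include <stddef.h>

/* Snippet 1: the loop-invariant test on `a` sits inside the loop body,
   so the source evaluates it once per iteration. */
void fill_branch_inside(int *output, size_t n, int a)
{
    for (size_t i = 0; i < n; i++) {
        if (a == 3)
            output[i] = 1;
        else
            output[i] = 0;
    }
}

/* Snippet 2: the test is hoisted out and the loop is duplicated,
   one copy per outcome. This is the shape loop unswitching produces. */
void fill_unswitched(int *output, size_t n, int a)
{
    if (a == 3) {
        for (size_t i = 0; i < n; i++)
            output[i] = 1;
    } else {
        for (size_t i = 0; i < n; i++)
            output[i] = 0;
    }
}
```

Both functions write the same values for any a, so the only way to know whether your toolchain treats them identically is to compare the generated assembly (for example with gcc -O3 -S, where GCC's -funswitch-loops pass is normally enabled). An optimizer may also rewrite either loop further, for instance by vectorizing the stores.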

Other tips

You asked a performance question without specifying hardware (although from the question we can infer that it's one of the architectures that have branch prediction), toolchain, or compile options.

Overall, this is just another space versus speed tradeoff, where code size itself often affects speed through the CPU's instruction and micro-op caches.

The only reasonable answer is "Performance will vary depending on processor hardware and compiler optimizations."
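Since the answer depends on the machine and the compiler, the practical route is to measure on your own setup. Below is a rough timing sketch using the standard clock() function. The helper time_fill, the fill_fn typedef and the function fill_in_loop (a copy of snippet 1 used as the function under test) are hypothetical names, and a serious benchmark would need warm-up runs, repetition and an inspection of the generated code.

```c
#include <stddef.h>
#include <time.h>

typedef void (*fill_fn)(int *output, size_t n, int a);

/* Function under test: snippet 1 from the question.
   Swap in the other variant to compare. */
void fill_in_loop(int *output, size_t n, int a)
{
    for (size_t i = 0; i < n; i++) {
        if (a == 3)
            output[i] = 1;
        else
            output[i] = 0;
    }
}

/* Returns the CPU time, in seconds, spent calling fn `reps` times. */
double time_fill(fill_fn fn, int *output, size_t n, int a, long reps)
{
    if (n == 0 || reps <= 0)
        return 0.0;

    /* Reading one element into a volatile after each call makes it
       harder for the optimizer to discard the stores. */
    volatile unsigned sink = 0;

    clock_t start = clock();
    for (long r = 0; r < reps; r++) {
        fn(output, n, a);
        sink += (unsigned)output[(size_t)r % n];
    }
    clock_t end = clock();

    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_fill(fill_in_loop, output, 100, 3, 10000000L) and then the same with the second variant, at the optimization level you actually ship, gives a first impression. With only 100 iterations per call the differences are likely to be small and noisy, which is consistent with the answers above.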

Licensed under: CC-BY-SA with attribution
