How much does the order of case labels affect the efficiency of switch statements?

https://stackoverflow.com/questions/1827406

11-09-2019
|

Question

Consider:

if (condition1)
{
    // Code block 1
}
else
{
    // Code block 2
}

If I know that condition1 will be true the majority of the time, then I should code the logic as written, instead of:

if (!condition1)
{
    // Code block 2
}
else
{
    // Code block 1
}

since I will avoid the penalty of the jump to the second code block (note: I have limited knowledge of assembly language). Does this idea carry forward to switch statements and case labels?

switch (myCaseValue)
{
    case Case1:
        // Code block 1
        break;

    case Case2:
        // Code block 2
        break;

    // etc.
}

If I know that one of the cases will happen more often, can I rearrange the order of the case labels so that it's more efficient? Should I? In my code I've been ordering the case labels alphabetically for code readability without really thinking about it. Is this micro-optimization?

Solution

Some facts for modern hardware like x86 or x86_64:

A unconditionally taken branch has almost no additional costs, besides the decoding. If you want a number, it's about a quarter clock cycle.
A conditional branch, which was correctly predicted, has almost no additional costs.
A conditional branch, which was not correctly predicted, has a penalty equal to the length of the processor pipelines, this is about 12-20 clocks, depending on the hardware.
The prediction mechanisms are very sophisticated. Loops with a low number of iterations (on Core 2 for example up to 64) can be perfectly predicted. Small repeating patterns like "taken-taken-nottaken-taken" can be predicted, if they are not too long (IIRC 6 on Core2).

You can read more about branch prediction in Agner Fogs excellent manual.

Switch statements are usually replaced by a jump table by the compiler. In most cases the order of cases won't make a difference at all. There are prediction mechanisms for indirect jumps as well.

So the question isn't if your jumps are more likely to be taken, it is if they are well predictable, at least for the hardware you intend to run your code on. This isn't an easy question at all. But if you have branches depending on a random (or pseudo random) condition, you could try to reformulate it as a branchless statement if possible.

OTHER TIPS

Your conclusion regarding the if statements will not be true on most of the hardware I'm familiar with. The problem is not that you are jumping, but that you are branching. The code could go two different ways, depending on the result of a comparison. This can stall the pipeline on most modern CPUs. Branch prediction is common, and fixes the problem most of the time, but has nothing to do with your example. The predictor can equally well predict that a comparison will be false as it can that it will be true.

As usual, see wikipedia: Branch Predictor

It depends. The compiler will use a bunch of internal implementation-dependent criteria to decide whether to implement the switch as a sequence of if-like tests, or as a jump table. This might depend, for example, on how "compact" your set of case labels is. If your case label values form a "dense" set, the compiler is probably more likely to use a jump table, in which case the ordering of case labels won't matter. If it decides to go with what resembles a sequence of if-else tests, the order might matter.

Keep in mind though, that the body of switch is one large statement, with case labels providing multiple entry points into that statement. For that reason, the compilers ability (as well as yours) to rearrange the case "sub-blocks" within that statement might be limited.

Case labels should be ordered in the most effecient way for readability.

Reordering case labels for efficiency is a case of premature optimization unless a profiler has specifically told you this is a problem.

I think that even your initial premise - that you can optimize the if statement by rearranging the conditional may well be faulty. In a non-optimized build you might find doing what you're talking about has some value - maybe. In the general case you're going to have to jump at least once for either case, so there's no advantage (in general) to arranging the conditional anyway. But that's for non-optimized builds, so who cares about that optimization?

In optimized builds, I think you might be surprised by what a compiler sometimes generates for an if statement. The compiler may move one or the other (or both) cases to somewhere out-of-line. I think that you trying to optimize this naively by playing with which condition 'comes first' won't necessarily do what you want. At best you should do this only after examining what the compiler is generating. And, of course, this becomes an expensive process, since even the slightest change you make around the statement can change how the compiler decides to generate the output code.

Now, as far as the switch statement is concerned, I'd always go with using a switch when it makes the code more readable. The worst that a compiler should do with a switch statement that is equivalent to an if statement is to generate the same code. For more than a few cases, switch statements will generally be compiled as a jump table. But then again a set of if tests that are comparing a single variable to a set of values might very well be recognized by a compiler such that it'll do the same. However, I'd guess that using a switch will enable to compiler to recognize the situation much more readily.

If you're really interested in getting the most out of the performance of that conditional, you might consider using something like MSVC's Profile Guided Optimization (PGO or 'pogo')which uses the results of profiling runs to optimize how conditionals get generated. I don't know whether or not if GCC has similar capabilities.

I'm not sure about the C# compiler, but I know that in assembly a switch statement can actually be programmed as a jump to a specific line, rather than evaluating the expression like an if statement. Since in a select you have all constants, it just treats cases as line numbers and you jump directly to the line number (case value) passed in without any evaluation. This makes the order of the case statements not really matter at all.

I assume you're aware that it will only matter if this is a hotspot. The best way to tell if it's a hotspot is to run the code, sample the program counter, and see if it's in there more than 10% of the time. If it is a hotspot, see how much time is spent in doing the if or switch. Usually it is negligible, unless your Block 1 and/or Block 2 do almost nothing. You can use a profiler for this. I just pause it repeatedly.

If you're not familiar with assembly language I would suggest learning it, enough to understand what the compiler generates. It's interesting and not hard.

As others have said, it depends on lots of things, including how many cases there are, how optimization is done, and the architecture you're running on. For an interesting overview, see http://ols.fedoraproject.org/GCC/Reprints-2008/sayle-reprint.pdf

If you put the cases that happen most often first, this will optimize the code slightly, and because of the way switch statments work the same is true. When the program goes into switch and finds a case that's true, it will execute it and hit break, which will exit out of the loop. Your thinking is correct.

However, I do think this optimization is pretty minimal, and if it slows your development time to do this, it's probably not worth it. Also if you have to modify your program flow drastically to accommodate this, it's probably not worth it. You're only saving a couple cycles at most and likely would never see the improvement.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow