Question

I don't like switch in C and its descendants. Maybe it's the need for a break or the clumsiness of the syntax requiring a shifting of mental gears, I don't know.

Whatever the reason, I will almost always write a sequence of else-if conditions examining the same variable rather than write a switch.

I see from Is there any significant difference between using if/else and switch-case in C#? that switch is more efficient in most cases and it got me wondering...

How would a compiler (for a language with C-like if, else and switch) detect a number of else-if conditions on the same value and internally re-write it into a switch?

Was it helpful?

Solution

Your optimization wish (of transforming a sequence of if into the equivalent of a switch) makes sense only (as Jörg W Mittag wisely commented) for idempotent conditions (which are undecidable to detect in general). But we could restrict these conditions to be a disjunction of equality tests like x == k where x is always the same (SSA-form) simple scalar variable (e.g. some declared int local variable in C or C++) which is unmodified in then parts and k is a compile-time constant.

As far as I know, some versions of GCC in some optimizations modes have been able of such an optimization. And I just checked (in march 2015), my Clang 3.5 (invoked as clang -O3 -fverbose-asm -S) is doing such an optimization with the code below, while my GCC 4.9 don't do it (on GNU/Linux/Debian/Sid/x86-64) :

 extern void f_a(int);
 extern void f_b(int);
 extern void f_c(int);
 extern void f_d(int);
 extern void f_e(int);
 extern void f_f(int);
 extern void f_g(int);
 extern void f_h(int);
 extern void f_i(int);
 extern void f_j(int);
 extern void f_k(int);
 extern void f_l(int);
 extern void f_m(int);
 extern void f_n(int);

 void f1(int x, int y)
 {
   if (x=='a') f_a(y);
   else if (x=='b') f_b(y);
   else if (x=='c') f_c(y);
   else if (x=='d') f_d(y);
   else if (x=='e') f_e(y);
   else if (x=='f') f_f(y);
   else if (x=='g') f_g(y);
   else if (x=='h') f_h(y);
   else if (x=='i') f_i(y);
   else if (x=='j') f_j(y);
   else if (x=='k') f_k(y);
   else if (x=='l') f_l(y);
   else if (x=='m') f_m(y);
   else if (x=='n') f_n(y);
 }    

I guess that GCC don't optimize like you want because it is not worth doing. GCC is often producing slightly faster code than Clang.

BTW, switch is not always compiled into an indirect jump. In particular a switch with a dozen of widely spread cases is not compiled using a jump table, e.g. code like

unsigned u = something(x); /// practically would be some hash function
switch (u) {
  case 3935272677: do_some_A(x); break;
  case 2389398680: do_some_B(x); break;
  case 1384792968: do_some_C(x); break;
  case 3957729125: do_some_D(x); break;
  case 4174747340: do_some_E(x); break;
  /// etc...
  default: do_other_thing(x); break;
}

In practice with gcc -O3 -fverbose-asm -S such code is compiled as a tree of branches.

You should read the paper by Roger Anthony Sayle A Superoptimizer Analysis of Multiway Branch Code Generation (GCC Summit 2008)

Also, match expressions in Ocaml are perhaps relevant, they are not always compiled as naive sequence of tests, but usually as a tree of tests.

There is a significant literature on compiling pattern matching. See Rete algorithm for a start, and Luc Maranget & Fabrice Le Fessant paper on Optimizing Pattern Matching (ICFP2001); you'll find a lot of papers and conferences on the subject.

BTW, your implicit hypothesis that some indirect branch is always faster than a serie of jumps is naive (on current processors), because of TLB, CPU caches, branch predictors, instruction pipelines, conditional moves machine instructions, etc etc... Read also about threaded code and notice that both GCC and Clang/LLVM accept the (very useful) label as values and computed goto extension to C.

Read also about GNU gperf perfect hash generator.

At last, you should not care and leave such micro-optimizations to the compiler ! If you feel (but I don't) that a long series of if is more readable than a long switch code as you prefer.

Licensed under: CC-BY-SA with attribution
scroll top