C++ Long switch statement or look up with a map?

https://stackoverflow.com/questions/2465738

20-09-2019
|

Question

In my C++ application, I have some values that act as codes to represent other values. To translate the codes, I've been debating between using a switch statement or an stl map. The switch would look something like this:

int code;
int value;
switch(code)
{
case 1:
    value = 10;
    break;
case 2:
    value = 15;
    break;
}

The map would be an stl::map<int, int> and translation would be a simple lookup with the code used as the key value.

Which one is better/more efficient/cleaner/accepted? Why?

Solution

Personally, I would use the map, as its use implies a data lookup - using a switch usually indicates a difference in program behavior. Furthermore modifying the data mapping is easier with a map than with a switch.

If performance is a real issue, profiling is the only way to get a usable answer. A switch may not be faster if branch mispredictions happen often enough.

Another approach to think about this is if it wouldn't make more sense to combine the code and the associated value into a datastructure, especially if the range of codes and values is static:

struct Code { int code; int value; };

Code c = ...

std::cout << "Code " << c.code << ", value " << c.value << std::end;

OTHER TIPS

If your codes are contiguous enough and their range permit you would be much better of with old-fashioned straightforward array, something like

int lookup[] = {-1, 10, 15, -1 222};

then the switch statement can be rewritten as simply as

value = lookup[code];

all other options introduce additional cost to some extent.

It rather depends on what the codes are and how many there are. Good compilers have various tricks they use to optimise switch statements, some of which they won't employ with straight if/then statements. Most are bright enough to do simple maths or use lookup/jump tables for case 0, 1, 2 or case 3, 6, 9 for example.

Of course some don't, and many are easily foiled by unusual or irregular sets of values. Also if code for handling several cases looks remarkably similar, cut and paste can lead to maintenance issues. If you have many codes but they can be divided algorithmically into groups, you might consider several/nested switch statements, for example rather than:

switch (code) {
    case 0x0001: ...
    case 0x0002: ...
    ...
    case 0x8001: ...
    case 0x8002: ...
    ...
}

You might use:

if (code & 0x8000) {
    code &= ~0x8000;
    switch (code) {
        case 0x0001: ... // actually 0x8001
        case 0x0002: ... // actually 0x8002
        ...
    }
}
else {
    switch (code) {
        case 0x0001: ...
        case 0x0002: ...
        ...
    }
}

Many language interpreters decode opcodes this way, since a single byte opcode may have additional information packed into various bits, and transcribing all possible combinations and their handlers would be repetitious and fragile. On the other hand, excessive bit mangling can defeat any optimisation by the compiler and be counter-productive.

Unless you're sure this is a real performance bottleneck I'd avoid premature optimisation: do it whichever way strikes you as reasonably robust and quick to implement. As and if your application is running too slowly, profile it and optimise accordingly.

The switch statement would be faster but if this is not in your applications performance bottleneck you should not really care about that.

Go for what makes your code easier to maintain over the long run.

Your sample is too short to make any meaningful call in that regard.

I'm partial to lookup tables myself, because unusually long switch statements seem to me to confuse a separation between code and data. I also think tables lend themselves better to future changes and maintenance.

All IMHO, of course.

I suggest using a static, constant, table of pairs. This is another form of the look up table:

struct Table_Entry
{
    int code;
    int value;
};

static const Table_Entry  lookup_table[] =
{
  {1, 10},
  {2, 15},
  {3, 13},
};

static const unsigned int NUM_TABLE_ENTRIES =
    sizeof(lookup_table) / sizeof(lookup_table[0]);

A benefit to this is that the table is generated at compile time, unlike a std::map which must be initialized during run-time. If the quantities are large, you could use std::lower_bound to find the entry, provided the table is ordered.

Another benefit is this technique is data driven. The data can change without changes to the search engine. Changes to code or process may require serious regression testing but data changes may not; YMMV.

This is similar to what a compiler might generate.

If you can use tr1 you could use unordered_map to hash the values (hashing ints can be really fast too), which should make most lookups constant time.

However unless you have profiling data to indicate that this is a bottleneck in your program, code it in the approach that makes the most sense from a design standpoint.

I say map if all you are doing is assigning a value. My reason is that it is extensible without changing code, which your switch statement is not.

btw, how about an enum?

I think the generated code of the switch-case structure can grow quite large, if the amount of "codes" becomes large, in which case I think the stl::map is more appropriate.

Normally I'd suggest a map or lookup array or maybe even some hybrid lookup monster (assuming that you're optimizing for speed or code size rather than readability, of course), but it is worth noting that the new versions of GCC are smart enough to replace switch/case assignments like this to lookup tables. While this isn't so great for totally sparse keys it may be if you're keys are in clumps like: 0, 1, 2, 3, 4, 5, 11, 12, 13, 14, 15, 193, 194, 195, 196, 197, 198...

Of course if you could possibly fond some arithmatic to do the conversion, even better (usually). Never overlook shifting, swapping endianness, or more traditional math when doing something like this.

unordered_map probably will be better suited as here as it is using hash table and lookup will be really faster instead of switch.

Read the integers into an array/vector
sort the array/vector
use bsearch on the underlying array

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow