Question

I have a piece of code with a) which I replaced with b) purely for legibility ...

a)

if ( WORD[ INDEX ] == 'A' ) branch = BRANCH.A;
/* B through to Y */
if ( WORD[ INDEX ] == 'Z' ) branch = BRANCH.Z;

b)

switch ( WORD[ INDEX ] ) {
    case 'A' : branch = BRANCH.A; break;
    /* B through to Y */
    case 'Z' : branch = BRANCH.Z; break;
}


... will the switch version cascade through all the permutations or jump to a case ?



EDIT:

Some of the answers below regard alternative approaches to the approach above.
I have included the following to provide context for its use.

The reason I asked, the Question above, was because the speed of adding words empirically improved.

This isn't production code by any means, and was hacked together quickly as a PoC.

The following seems to be a confirmation of failure for a thought experiment.
I may need a much bigger corpus of words than the one I am currently using though.
The failure arises from the fact I did not account for the null references still requiring memory. ( doh ! )

public class Dictionary {
    private static Dictionary ROOT;
    private boolean terminus;
    private Dictionary A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z;
    private static Dictionary instantiate( final Dictionary DICTIONARY ) {
        return ( DICTIONARY == null ) ? new Dictionary() : DICTIONARY;
    }
    private Dictionary() {
        this.terminus = false;
        this.A = this.B = this.C = this.D = this.E = this.F = this.G = this.H = this.I = this.J = this.K = this.L = this.M = this.N = this.O = this.P = this.Q = this.R = this.S = this.T = this.U = this.V = this.W = this.X = this.Y = this.Z = null;
    }
    public static void add( final String...STRINGS ) {
        Dictionary.ROOT = Dictionary.instantiate( Dictionary.ROOT );
        for ( final String STRING : STRINGS ) Dictionary.add( STRING.toUpperCase().toCharArray(), Dictionary.ROOT , 0, STRING.length() - 1 );
    }
    private static void add( final char[] WORD, final Dictionary BRANCH, final int INDEX, final int INDEX_LIMIT ) {
        Dictionary branch = null;
        switch ( WORD[ INDEX ] ) {
        case 'A' : branch = BRANCH.A = Dictionary.instantiate( BRANCH.A ); break;
        case 'B' : branch = BRANCH.B = Dictionary.instantiate( BRANCH.B ); break;
        case 'C' : branch = BRANCH.C = Dictionary.instantiate( BRANCH.C ); break;
        case 'D' : branch = BRANCH.D = Dictionary.instantiate( BRANCH.D ); break;
        case 'E' : branch = BRANCH.E = Dictionary.instantiate( BRANCH.E ); break;
        case 'F' : branch = BRANCH.F = Dictionary.instantiate( BRANCH.F ); break;
        case 'G' : branch = BRANCH.G = Dictionary.instantiate( BRANCH.G ); break;
        case 'H' : branch = BRANCH.H = Dictionary.instantiate( BRANCH.H ); break;
        case 'I' : branch = BRANCH.I = Dictionary.instantiate( BRANCH.I ); break;
        case 'J' : branch = BRANCH.J = Dictionary.instantiate( BRANCH.J ); break;
        case 'K' : branch = BRANCH.K = Dictionary.instantiate( BRANCH.K ); break;
        case 'L' : branch = BRANCH.L = Dictionary.instantiate( BRANCH.L ); break;
        case 'M' : branch = BRANCH.M = Dictionary.instantiate( BRANCH.M ); break;
        case 'N' : branch = BRANCH.N = Dictionary.instantiate( BRANCH.N ); break;
        case 'O' : branch = BRANCH.O = Dictionary.instantiate( BRANCH.O ); break;
        case 'P' : branch = BRANCH.P = Dictionary.instantiate( BRANCH.P ); break;
        case 'Q' : branch = BRANCH.Q = Dictionary.instantiate( BRANCH.Q ); break;
        case 'R' : branch = BRANCH.R = Dictionary.instantiate( BRANCH.R ); break;
        case 'S' : branch = BRANCH.S = Dictionary.instantiate( BRANCH.S ); break;
        case 'T' : branch = BRANCH.T = Dictionary.instantiate( BRANCH.T ); break;
        case 'U' : branch = BRANCH.U = Dictionary.instantiate( BRANCH.U ); break;
        case 'V' : branch = BRANCH.V = Dictionary.instantiate( BRANCH.V ); break;
        case 'W' : branch = BRANCH.W = Dictionary.instantiate( BRANCH.W ); break;
        case 'X' : branch = BRANCH.X = Dictionary.instantiate( BRANCH.X ); break;
        case 'Y' : branch = BRANCH.Y = Dictionary.instantiate( BRANCH.Y ); break;
        case 'Z' : branch = BRANCH.Z = Dictionary.instantiate( BRANCH.Z ); break;
        }   
        if ( INDEX == INDEX_LIMIT ) branch.terminus = true;
        else Dictionary.add( WORD, branch, INDEX + 1, INDEX_LIMIT );
    }
    public static boolean is( final String STRING ) {
        Dictionary.ROOT = Dictionary.instantiate( Dictionary.ROOT );
        return Dictionary.is( STRING.toUpperCase().toCharArray(), Dictionary.ROOT, 0, STRING.length() - 1 );
    }
    private static boolean is( final char[] WORD, final Dictionary BRANCH, final int INDEX, final int INDEX_LIMIT ) {
        Dictionary branch = null;
        switch ( WORD[ INDEX ] ) {
        case 'A' : branch = BRANCH.A; break;
        case 'B' : branch = BRANCH.B; break;
        case 'C' : branch = BRANCH.C; break;
        case 'D' : branch = BRANCH.D; break;
        case 'E' : branch = BRANCH.E; break;
        case 'F' : branch = BRANCH.F; break;
        case 'G' : branch = BRANCH.G; break;
        case 'H' : branch = BRANCH.H; break;
        case 'I' : branch = BRANCH.I; break;
        case 'J' : branch = BRANCH.J; break;
        case 'K' : branch = BRANCH.K; break;
        case 'L' : branch = BRANCH.L; break;
        case 'M' : branch = BRANCH.M; break;
        case 'N' : branch = BRANCH.N; break;
        case 'O' : branch = BRANCH.O; break;
        case 'P' : branch = BRANCH.P; break;
        case 'Q' : branch = BRANCH.Q; break;
        case 'R' : branch = BRANCH.R; break;
        case 'S' : branch = BRANCH.S; break;
        case 'T' : branch = BRANCH.T; break;
        case 'U' : branch = BRANCH.U; break;
        case 'V' : branch = BRANCH.V; break;
        case 'W' : branch = BRANCH.W; break;
        case 'X' : branch = BRANCH.X; break;
        case 'Y' : branch = BRANCH.Y; break;
        case 'Z' : branch = BRANCH.Z; break;
        }
        if ( branch == null ) return false;
        if ( INDEX == INDEX_LIMIT ) return branch.terminus;
        else return Dictionary.is( WORD, branch, INDEX + 1, INDEX_LIMIT );
    }
}
Was it helpful?

Solution

In bytecode there are two forms of switch: tableswitch and lookupswitch. One assumes a dense set of keys, the other sparse. See the description of compiling switch in the JVM spec. For enums, the ordinal is found and then the code continues as the int case. I am not entirely sure how the proposed switch on String little feature in JDK7 will be implemented.

However, heavily used code is typically compiled in any sensible JVM. The optimiser is not entirely stupid. Don't worry about it, and follow the usual heuristics for optimisation.

OTHER TIPS

Don't worry about performance; use the syntax that best expresses what you're doing. Only after you have (a) demonstrated a performance deficiency; and (b) localized it to the routine in question, only then should you worry about performance. For my money, the case syntax is more appropriate here.

It looks like you have enumerated the values so perhaps an enumeration is in order?

enum BRANCH {
  A,B, ... Y,Z;
}

Then in your code :

BRANCH branch = BRANCH.valueOf( WORD[ INDEX ] );

Also, there is a possible bug in your code where "A" == "A" may be false depending on the object identity of the "A"'s.

Not quite an answer to your question, but there's actually a bug in your code, you should have a break after each case:

switch ( WORD[ INDEX ] ) {
    case 'A' : branch = BRANCH.A; break;
    /* B through to Y */
    case 'Z' : branch = BRANCH.Z; break;
}

I don't think performance differences are going to be too significant here, but if you really care about performance, and if this code is executed very frequently, here's another option:

// Warning, untested code.
BRANCH[] branchLookUp = {BRANCH.A, BRANCH.B, ..., BRANCH.Z};

branch = branchLookUp[WORD[INDEX] - 'A'];

Be sure to encapsulate this and document it well though.

Honestly, I don't think the performance matters in this case. It's really up to the compiler and its optimization.

If you have a switch statement with consecutive, integral values, depending on the language, it may be optimized to a branch table, which is very quick. It's not slower, anyway!

Additionally, using if/else if would be an improvement over if, for cases such as this one in which your cases are mutually exclusive. No sense making 25 more checks after matching A.

But basically, any performance difference is negligible, and you ought to use the most correct syntax, which in this case is the switch statement. Make sure to separate your cases with breaks though.

Here's a way to avoid all of the case statements.

import java.util.HashMap;

public class Dictionary {
    private static Dictionary                       ROOT;
    private boolean                                 terminus;
    private final HashMap<Character, Dictionary>    dictionaries    = new HashMap<Character, Dictionary>();

    private void ensureBranch(char c) {
        if (getBranch(c) != null)
            return;
        dictionaries.put(c, new Dictionary());
    }

    private Dictionary getBranch(char c) {
        return dictionaries.get(c);
    }

    public static boolean is(final String string) {
        ensureRoot();
        return is(chars(string), ROOT, 0, string.length() - 1);
    }

    public static void add(final String... strings) {
        ensureRoot();
        for (final String string : strings)
            add(chars(string), ROOT, 0, string.length() - 1);
    }

    private static void ensureRoot() {
        if (ROOT == null)
            ROOT = new Dictionary();
    }

    private static char[] chars(final String string) {
        return string.toUpperCase().toCharArray();
    }

    private Dictionary() {
        this.terminus = false;
    }

    private static void add(final char[] word, final Dictionary dictionary, final int index, final int limit) {
        Dictionary branch = getBranch(word, dictionary, index);
        if (index == limit)
            branch.terminus = true;
        else
            add(word, branch, index + 1, limit);
    }

    private static Dictionary getBranch(final char[] word, final Dictionary dictionary, final int index) {
        final char c = word[index];
        dictionary.ensureBranch(c);
        return dictionary.getBranch(c);
    }

    private static boolean is(final char[] word, final Dictionary dictionary, final int index, final int limit) {
        Dictionary branch = dictionary.getBranch(word[index]);
        if (branch == null)
            return false;
        if (index == limit)
            return branch.terminus;
        return is(word, branch, index + 1, limit);
    }
}

I know this is not what you are asking at all, but aren't you just doing this?

public class Dictionary {
    private static final Set<String> WORDS = new HashSet<String>();

    public static void add(final String... STRINGS) {
        for (String str : STRINGS) {
            WORDS.add(str.toUpperCase());
        }
    }

    public static boolean is(final String STRING) {
        return WORDS.contains(STRING.toUpperCase());
    }
}

Are you simply looking for something a little more memory efficient?

The switch statement should use a hash to select which case to go to. From there, every subsequent case will also be run if there are no break statements. For example, with your code, if you switch on X, it will immediately go to X, then Y, then Z. See the Java Tutorial.

The switch should be logarithmic and the if linear, assuming the compiler can't find anything clever. But long switches are tricky to read and also bug-prone -- as noted, the switch you've got above doesn't have any breaks, and it's going to fall through all the cases.

Why not prepopulate a Map instead, and just use Map.get()?

private static final Map<Char, Whatever> BRANCHES = Collections.unmodifiableMap(new HashMap<Char, Whatever>() {{
    put('A', BRANCH.A);
    ...
    put('Z', BRANCH.Z);
}}

public void getBranch(char[] WORD, int INDEX) {
    return BRANCHES.get(WORD[INDEX]);
}

As noted above, if BRANCH is an Enum, this behavior should properly be in the Enum.

(What are WORD, INDEX and BRANCH here, anyway? From the names, they should be constants, but you can't really have constant arrays -- the contents are alwyas modifiable; there wouldn't be much point in creating a constant "struct"; and there certainly wouldn't be much point in iffing or switching based on constants....)

I think this is more of a question about style rather than performance. I think in this case switch statement is more appropriate than if. I'm not sure there is much difference in performance.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top