Question

I am writing a set of visual interfaces to data structures in Java. The idea is that these classes should be high performance implementations of algorithms, but with embedded hooks so that the algorithms can be displayed interactively.

There are a number of reasons to do this, but if you will accept this request at face value, I want to embed calls in the middle of algorithms to identify that a particular subsection has just been done. For example, one pass of a sorting algorithm.

I would prefer that the library be both very efficient and allow this. In C++, I would plug in two different templates or use conditional compilation, and two versions of the code could reasonably be generated. Is there any way to achieve this in Java? I'm hoping someone else can come up with one because I can't.

NEWS FLASH. I tried this actual code.

For n=100,000 an insertion sort takes approximately 9800 ms with VISUALIZE as a static variable but not final, vs. about 3100 with it commented out. So the performance loss is unacceptable.

With visualize as static final, the optimizer does indeed detect it and chop it out, but given that it's final what can I do with it? I can't dynamically turn visualization on and off!

public class TestSort {
    private static boolean VISUALIZE = false; 
    private static ArrayObserver ao;
    public static void insertionSort(int[] x) {
        for (int i = 1; i < x.length; i++) {
            int temp = x[i];
            int j = i - 1;
            if (x[j] > temp) {
                do {    
                    x[j+1] = x[j];
/*              if (VISUALIZE) {
                        ao.compare(i-1, i);
            ao.copy(i-1, i);
            }*/
                } while (--j >= 0 && x[j] > temp);
                x[j+1] = temp;
            }
        }
    }
    static Random r = new Random();
    static int[] createRandomArray(int n) {
    int[] x = new int[n];
    for (int i = 0; i < x.length; i++)
        x[i] = r.nextInt(100);
        return x;
    }
    static void display(int[] x) {
    for (int i = 0; i < x.length; i++)
        System.out.print(x[i] + " ");
        System.out.println();
    }
    public static void main(String args[]) {
    //int[] x = {9, 8, 7, 6, 5, 4, 3, 2, 1};
    int [] x = createRandomArray(100000);
    ao = new ArrayObserver(x);
    if (x.length < 20) display(x);
    long t0 = System.currentTimeMillis();
    insertionSort(x);
    long t1 = System.currentTimeMillis();
    if (x.length < 20) display(x);
    System.out.println(t1-t0);  
    }
}
Was it helpful?

Solution

I think you have a few options:

  • Put a test in on a static final constant (e.g. a boolean), which conditionally executes the the visual interface code. The JVM will eliminate the dead code when you set the constant to "off", and your code will be fully optimised. The downside is, of course, that you can only switch this at compile time, but this may be fine if you actually want to build two copies of the library.
  • Add an extra parameter to the function that determines whether the visual interface is called, and test this where necessary in your algorithm. This will add a small amount of runtime overhead, but may well be acceptable. I suggest you benchmark this, in my experience though such tests on a local variable are usually sufficiently cheap that you can get away with it (in CPU terms, will probably be just a register test which is likely to be even cheaper than the cost of a single memory access into an int[] array...)
  • Use a higher level / meta-language to express the algorithms, and use code-generation techniques to generate the actual code you need (with or without). I've done stuff like this in Clojure, for example. It's also an option to generate bytecode directly with tools like ASM (if all you care about is execution, and don't need the Java source code).
  • Use a textual pre-processor. It should work fine for Java code generation (as it does for C/C++), though it's not such a common approach and you may find it a bit fragile. You may need to do some clever stuff like generating different class names in the file system etc.

OTHER TIPS

Contrary to e.g. C++, the final compilation to native machine code is done at runtime, in this case completely eliminating the need for building two separate versions for performance reasons.

If you pass the boolean to enable/disable the extra calls as a parameter to the constructor of the class that implements your algorithm and store it in a final class variable (i.e. a constant), when the algorithm gets executed in a tight loop (= a 'hot spot') the Hotspot VM will compile the class instance and remove the dead code. This kind of runtime optimizations can't be done with C++.

But note that a boolean test is likely to cost only a very small fraction of the total algorithm.

EDIT: your tests have shown that this doesn't work, although I'm not sure they are done correctly. You are not using any benchmarking framework. The most aggressive optimizations will happen with the server VM (-server) and then code must be properly warmed up first (the first 10000 or so iterations will happen uncompiled which is of course much slower). Also, using a template pattern may have better chance of being optimized than a final boolean, as the boolean check is cheap anyway and the compiler is known to do virtual call inlining (as far as I know).

EDIT2: if you don't need to switch at runtime (after all conditional compilation and separate builds won't help you there either) just go with the static final boolean which you know gets optimized. Initialize it with the value from a command line argument or config file and you can easily switch between both versions at application start time.

If you do:

private static final ENABLED = false;

// Then later...
if(ENABLED){
    call();
}

The whole if-block will not be even included in the generated bytecode (at least on newer JVMs). Would that be an option?

The reason why Java doesn't have a preprocessor is to prevent programmers from doing exactly what you are trying to do. Instead, you should write Java code to implement the functionality you would otherwise implement in the preprocessor directives and let the compiler optimize the generated code. This way, you will end up with code written in a single language (instead of two languages, Java and preprocessor DSL), making things a lot easier for the toolchain you use when analyzing your code.

One way to solve your problem in pure Java code is to use the Template method pattern.

As this guy claims to prove, it is possible to modify the bytecode to set a final variable on runtime. I guess you could use the same aproach to set static final VISUALIZE on and off.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top