Question

I made a program that, for a variety of methods of dividing ints and returning the double result, runs this method for every possible combination of numerators and denominators between 0 and 20, inclusive, and it gave these results for the amount of time it takes to run all of these combinations:

Casting numerator:*-                                       0.000029 ms average (2852 ms / 100000000 reps)
Multiplying numerator:*-                                   0.000594 ms average (5940 ms / 10000000 reps)
Making double variable out of numerator:                   0.002192 ms average (2192 ms / 1000000 reps)
Casting denominator:*+                                     0.002201 ms average (2201 ms / 1000000 reps)
Multiplying denominator:                                   0.002181 ms average (2181 ms / 1000000 reps)
Making double variable out of denominator:                 0.002183 ms average (2183 ms / 1000000 reps)
Passing double numerator:*--                               0.000041 ms average (408 ms / 10000000 reps)
Passing double denominator:*--                             0.000034 ms average (343 ms / 10000000 reps)

Casting numerator, dividing twice:                         0.002506 ms average (2506 ms / 1000000 reps)
Multiplying numerator, dividing twice:                     0.002505 ms average (2505 ms / 1000000 reps)
Making double variable out of numerator, dividing twice:   0.002517 ms average (2517 ms / 1000000 reps)
Casting denominator, dividing twice:                       0.002520 ms average (2520 ms / 1000000 reps)
Multiplying denominator, dividing twice:                   0.002517 ms average (2517 ms / 1000000 reps)
Making double variable out of denominator, dividing twice: 0.002508 ms average (2508 ms / 1000000 reps)
Passing double numerator, dividing twice:*-                0.000704 ms average (7040 ms / 10000000 reps)
Passing double denominator, dividing twice:*-              0.000594 ms average (5935 ms / 10000000 reps)

However, running these tests in a different order caused some of the tests to take the same amount of time (as expected), but others to take drastically different amounts of time (* indicates results that I got significantly different):

Casting numerator:                                         0.000029 ms average (2904 ms / 100000000 reps)
Casting denominator:                                       0.000726 ms average (7263 ms / 10000000 reps)
Multiplying numerator:                                     0.002220 ms average (2220 ms / 1000000 reps)
Multiplying denominator:                                   0.002224 ms average (2224 ms / 1000000 reps)
Making double variable out of numerator:                   0.002236 ms average (2236 ms / 1000000 reps)
Making double variable out of denominator:                 0.002249 ms average (2249 ms / 1000000 reps)
Passing double denominator:                                0.000036 ms average (3586 ms / 100000000 reps)
Passing double numerator:                                  0.001455 ms average (1455 ms / 1000000 reps)

Casting numerator, dividing twice:                         0.002542 ms average (2542 ms / 1000000 reps)
Casting denominator, dividing twice:                       0.002546 ms average (2546 ms / 1000000 reps)
Multiplying numerator, dividing twice:                     0.002545 ms average (2545 ms / 1000000 reps)
Multiplying denominator, dividing twice:                   0.002542 ms average (2542 ms / 1000000 reps)
Making double variable out of numerator, dividing twice:   0.002559 ms average (2559 ms / 1000000 reps)
Making double variable out of denominator, dividing twice: 0.002588 ms average (2588 ms / 1000000 reps)
Passing double denominator, dividing twice:                0.000700 ms average (7002 ms / 10000000 reps)
Passing double numerator, dividing twice:                  0.001581 ms average (1581 ms / 1000000 reps)

And also running a "control" test which loops through all of the possible combinations but doesn't actually divide anything produces even stranger results, with control doing as well as numberator casting used to be doing but numerator now doing 20 times worse, and the control done later taking much more time than previously:

Control (Don't actually divide anything):                  0.000035 ms average (3494 ms / 100000000 reps)
Casting numerator:                                         0.000588 ms average (5880 ms / 10000000 reps)
Casting denominator:                                       0.002177 ms average (2177 ms / 1000000 reps)
Multiplying numerator:                                     0.002188 ms average (2188 ms / 1000000 reps)
Multiplying denominator:                                   0.002202 ms average (2202 ms / 1000000 reps)
Making double variable out of numerator:                   0.002186 ms average (2186 ms / 1000000 reps)
Making double variable out of denominator:                 0.002201 ms average (2201 ms / 1000000 reps)
Passing double denominator:                                0.002024 ms average (2024 ms / 1000000 reps)
Passing double numerator:                                  0.001456 ms average (1456 ms / 1000000 reps)

Control (Don't actually divide anything):                  0.000927 ms average (927 ms / 1000000 reps)
Casting numerator, dividing twice:                         0.002552 ms average (2552 ms / 1000000 reps)
Casting denominator, dividing twice:                       0.002556 ms average (2556 ms / 1000000 reps)
Multiplying numerator, dividing twice:                     0.002538 ms average (2538 ms / 1000000 reps)
Multiplying denominator, dividing twice:                   0.002554 ms average (2554 ms / 1000000 reps)
Making double variable out of numerator, dividing twice:   0.002546 ms average (2546 ms / 1000000 reps)
Making double variable out of denominator, dividing twice: 0.002535 ms average (2535 ms / 1000000 reps)
Passing double denominator, dividing twice:                0.002344 ms average (2344 ms / 1000000 reps)
Passing double numerator, dividing twice:                  0.001597 ms average (1597 ms / 1000000 reps)

These results are consistent for re-running the same order. Changing the number of repetitions usually has no significant effect on the average speed, but in the case of running control 10^8 times and then running control 10^7 times, I found that running casting numerator next to perform equally well as control if it is run 10^7 times but to run 20x slower if run 10^6 or 10^8 times. Making the program wait for input and setting its process priority to Realtime in Windows made no significant difference. What is the JVM and/or the CPU doing differently to make these performances so different?

Update re: "Run a warm-up test"

Retrying the first order of tests and third order of tests where each test is run six times consecutively, each consecutive run of the same test takes about the same amount of time to run (+- 15% at most, insignificant vs the order-of-magnitude changes) with the exceptions of the control and casting the numerator, which take 0 ms to run on the 3rd thru 6th consecutive tests, and multiplying the numerator, which in the first order of tests takes 0ms to run on the 4th thru 6th tests. In addition, each test seems to take time approximately equal to the time it took to run that test once in the third order of tests. Somehow, running the warm-up tests worsens performance in many cases!

The test program

public class Tests {
    public static void main(String[] args) throws Exception {
        //number of repetitions set so that each takes 1-10 seconds total to run on my machine
        testPerformance(castNumerator, 8);
        testPerformance(multiplyNumerator, 7);
        testPerformance(makeNumerator, 6);
        testPerformance(castDenominator, 6);
        testPerformance(multiplyDenominator, 6);
        testPerformance(makeDenominator, 6);
        testPerformance(diDivider, 6);
        testPerformance(idDivider, 8);
        System.out.println();
        testPerformance(castNumerator2, 6);
        testPerformance(multiplyNumerator2, 6);
        testPerformance(makeNumerator2, 6);
        testPerformance(castDenominator2, 6);
        testPerformance(multiplyDenominator2, 6);
        testPerformance(makeDenominator2, 6);
        testPerformance(diDivider2, 6);
        testPerformance(idDivider2, 7);
    }

    static void testPerformance(final Divider divider, final int logReps) {
        final int reps = (int)Math.pow(10, logReps);
        final long startTime;
        if (divider instanceof IntIntDivider) {
            final IntIntDivider iiDivider = ((IntIntDivider)divider);
            startTime = System.currentTimeMillis();
            for (int i = 0; i < reps; i++) {
                for (int n = 0; n < 20; n++) {
                    for (int d = 0; d < 20; d++) {
                        iiDivider.divide(n, d);
                    }
                }
            }
        } else if (divider instanceof DoubleIntDivider) {//yucky repetition, but the only fair way to do it because generics can't do primitives
            final DoubleIntDivider diDivider = ((DoubleIntDivider)divider);
            startTime = System.currentTimeMillis();
            for (int i = 0; i < reps; i++) {
                for (int n = 0; n < 20; n++) {
                    for (int d = 0; d < 20; d++) {
                        diDivider.divide(n, d);
                    }
                }
            }
        } else if (divider instanceof IntDoubleDivider) {
            final IntDoubleDivider idDivider = ((IntDoubleDivider)divider);
            startTime = System.currentTimeMillis();
            for (int i = 0; i < reps; i++) {
                for (int n = 0; n < 20; n++) {
                    for (int d = 0; d < 20; d++) {
                        idDivider.divide(n, d);
                    }
                }
            }
        } else {
            throw new RuntimeException("Impossible divider");
        }
        final long endTime = System.currentTimeMillis();
        final long time = (endTime - startTime);
        System.out.printf("    %-58s %f ms%n", divider + ":", time / (double) reps); //cast reps to double because casting time might result in precision loss
        //System.out.println(divider + ":");
        //System.out.println("\t" + time + " ms taken for " + reps + " runs");
        //System.out.printf("\tAverage of %f ms%n", time / (double) reps);
    }
    static interface Divider {}
    static abstract class IntIntDivider implements Divider {
        public abstract double divide(int n, int d);
    }
    static abstract class DoubleIntDivider implements Divider {
        public abstract double divide(double n, int d);
    }
    static abstract class IntDoubleDivider implements Divider {
        public abstract double divide(int n, double d);
    }
    static final IntIntDivider control = new IntIntDivider() {
        @Override
        public double divide(int n, int d) {
            return 1;
        }
        @Override
        public String toString() {
            return "Control (Don't actually divide anything)";
        }
    };
    static final IntIntDivider castNumerator = new IntIntDivider() {
        @Override
        public double divide(int n, int d) {
            return ((double)n) / d;
        }
        @Override
        public String toString() {
            return "Casting numerator";
        }
    };
    static final IntIntDivider multiplyNumerator = new IntIntDivider() {
        @Override
        public double divide(int n, int d) {
            return n * 1.0 / d;
        }
        @Override
        public String toString() {
            return "Multiplying numerator";
        }
    };
    static final IntIntDivider makeNumerator = new IntIntDivider() {
        @Override
        public double divide(int n, int d) {
            final double nDouble = n;
            return nDouble / d;
        }
        @Override
        public String toString() {
            return "Making double variable out of numerator";
        }
    };
    static final IntIntDivider castDenominator = new IntIntDivider() {
        @Override
        public double divide(int n, int d) {
            return n / (double) d;
        }
        @Override
        public String toString() {
            return "Casting denominator";
        }
    };
    static final IntIntDivider multiplyDenominator = new IntIntDivider() {
        @Override
        public double divide(int n, int d) {
            return n / (d * 1.0);
        }
        @Override
        public String toString() {
            return "Multiplying denominator";
        }
    };
    static final IntIntDivider makeDenominator = new IntIntDivider() {
        @Override
        public double divide(int n, int d) {
            final double dDouble = d;
            return n / dDouble;
        }
        @Override
        public String toString() {
            return "Making double variable out of denominator";
        }
    };
    static final DoubleIntDivider diDivider = new DoubleIntDivider() {
        @Override
        public double divide(double n, int d) {
            return n / d;
        }
        @Override
        public String toString() {
            return "Passing double numerator";
        }
    };
    static final IntDoubleDivider idDivider = new IntDoubleDivider() {
        @Override
        public double divide(int n, double d) {
            return n / d;
        }
        @Override
        public String toString() {
            return "Passing double denominator";
        }
    };
    static final IntIntDivider castNumerator2 = new IntIntDivider() {
        @Override
        public double divide(int n, int d) {
            return ((double)n) / d + ((double)n) / d;
        }
        @Override
        public String toString() {
            return "Casting numerator, dividing twice";
        }
    };
    static final IntIntDivider multiplyNumerator2 = new IntIntDivider() {
        @Override
        public double divide(int n, int d) {
            return n * 1.0 / d + n * 1.0 / d;
        }
        @Override
        public String toString() {
            return "Multiplying numerator, dividing twice";
        }
    };
    static final IntIntDivider makeNumerator2 = new IntIntDivider() {

        @Override
        public double divide(int n, int d) {
            final double nDouble = n;
            return nDouble / d + nDouble / d;
        }
        @Override
        public String toString() {
            return "Making double variable out of numerator, dividing twice";
        }
    };
    static final IntIntDivider castDenominator2 = new IntIntDivider() {
        @Override
        public double divide(int n, int d) {
            return n / (double) d + n / (double) d;
        }
        @Override
        public String toString() {
            return "Casting denominator, dividing twice";
        }
    };
    static final IntIntDivider multiplyDenominator2 = new IntIntDivider() {
        @Override
        public double divide(int n, int d) {
            return n / (d * 1.0) + n / (d * 1.0);
        }
        @Override
        public String toString() {
            return "Multiplying denominator, dividing twice";
        }
    };
    static final IntIntDivider makeDenominator2 = new IntIntDivider() {
        @Override
        public double divide(int n, int d) {
            final double dDouble = d;
            return n / dDouble + n / dDouble;
        }
        @Override
        public String toString() {
            return "Making double variable out of denominator, dividing twice";
        }
    };
    static final DoubleIntDivider diDivider2 = new DoubleIntDivider() {
        @Override
        public double divide(double n, int d) {
            return n / d + n / d;
        }
        @Override
        public String toString() {
            return "Passing double numerator, dividing twice";
        }
    };
    static final IntDoubleDivider idDivider2 = new IntDoubleDivider() {
        @Override
        public double divide(int n, double d) {
            return n / d + n / d;
        }
        @Override
        public String toString() {
            return "Passing double denominator, dividing twice";
        }
    };
}
Was it helpful?

Solution

I doubt the ordering has an effect here. The problem is that in the results where the ordering is changed, the number of times the tests were run is also changed. This is a problem because the more times the code is run, the more the JIT will optimize it.

If you look at just the numbers there's a pretty strong pattern that the tests that were run more times were faster. This is pretty classic JIT behavior. The optimized compilation gets run more times and weights the average. So it should be clear that the number of times the individual tests are run needs to be equal. Also the test should account for warm up.

So adjusting main gets something like this:

for(int i = 0; i < 2; i++) {
    testPerformance(castNumerator, 6);
    testPerformance(multiplyNumerator, 6);
    testPerformance(makeNumerator, 6);
    testPerformance(castDenominator, 6);
    testPerformance(multiplyDenominator, 6);
    testPerformance(makeDenominator, 6);
    testPerformance(diDivider, 6);
    testPerformance(idDivider, 6);
    System.out.println();
    testPerformance(castNumerator2, 6);
    testPerformance(multiplyNumerator2, 6);
    testPerformance(makeNumerator2, 6);
    testPerformance(castDenominator2, 6);
    testPerformance(multiplyDenominator2, 6);
    testPerformance(makeDenominator2, 6);
    testPerformance(diDivider2, 6);
    testPerformance(idDivider2, 6);
    System.out.println();
}

For warm up, this just runs the tests twice and the first set of results are probably not very meaningful.

But after doing that I see something like the following for the second run:

Casting numerator:                                         0.003012 ms
Multiplying numerator:                                     0.003011 ms
Making double variable out of numerator:                   0.003025 ms
Casting denominator:                                       0.003040 ms
Multiplying denominator:                                   0.003015 ms
Making double variable out of denominator:                 0.003006 ms
Passing double numerator:                                  0.000000 ms
Passing double denominator:                                0.000000 ms

Well that's kind of strange because that indicates the last two tests took 0 ms to run. And it's not because those divisions are really just very fast because changing the program so it uses nanoTime has no effect. (nanoTime is always recommended though here it probably won't make much of a difference since the times are long enough.)

Actually probably what is happening is the JIT has figured out you're ignoring return values so it's optimized those tests by not doing them somehow. As for why it only does it for certain tests, your guess is probably as good as mine, but it indicates a serious problem. If the JIT is doing this kind of extreme optimization, we have no idea what it's doing elsewhere.

This can be solved by just doing something (basically anything) with the return value:

// somewhere
static long result;

// for each op
result += idDivider.divide(n, d);

// at the end of the test
System.out.println(result);

This might slow down the tests a little but it will foil the JIT's cleverness.

After all those changes I get basically this every time:

Casting numerator:                                         0.007088 ms
Multiplying numerator:                                     0.007135 ms
Making double variable out of numerator:                   0.007162 ms
Casting denominator:                                       0.007180 ms
Multiplying denominator:                                   0.007206 ms
Making double variable out of denominator:                 0.007173 ms
Passing double numerator:                                  0.003650 ms
Passing double denominator:                                0.003663 ms

Casting numerator, dividing twice:                         0.007554 ms
Multiplying numerator, dividing twice:                     0.007574 ms
Making double variable out of numerator, dividing twice:   0.007538 ms
Casting denominator, dividing twice:                       0.007550 ms
Multiplying denominator, dividing twice:                   0.007503 ms
Making double variable out of denominator, dividing twice: 0.007577 ms
Passing double numerator, dividing twice:                  0.003765 ms
Passing double denominator, dividing twice:                0.003798 ms

Now if I had to guess at why this indicates it's faster to cast as a parameter, I would say HotSpot is compiling it so that the loop variable is always a double.

On benchmarking in general I'd recommend the following further reading:

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top