Question

In my free time I recently made a framework for multi-threaded math operations, and to test it I calculated the first couple of thousand prime numbers.

But I needed it to take more time, so I inserted this code into the prime calculation:

for (int i = 0; i < 1000000; i++)
{
    // Nothing.
}

For a long time, I wrote and compiled the code on a 64-bit machine and tested it on a number of 32-bit machines.

Then I ran it on a 64-bit machine and noticed a massive performance difference.

With the same code, a comparable 64-bit machine takes <100 ms to do what a 32-bit machine needs ~52,000 ms to do (two virtual machines on the same host).

I've tested on Windows and Ubuntu on different computers, and using the same .class file I still get this massive 32-bit vs 64-bit difference.

Here is a quick piece of code that you can use to replicate the performance difference.

import java.util.ArrayList;
import java.util.Collection;

public class Test
{
    public static void main(String[] args)
    {
        long start = System.currentTimeMillis();
        ArrayList<Integer> res = new ArrayList<Integer>();
        for (int k = 0; k < 50000; k++)
        {
            Collection<Integer> partres = work(k);
            if (partres != null)
                res.addAll(partres);
        }
        long end = System.currentTimeMillis();
        System.out.println("Done in " + (end - start) + " ms.");
    }

    public static Collection<Integer> work(Integer j)
    {
        // Busy loop inserted only to make each call take longer.
        for (int i = 0; i < 1000000; i++)
        {
            // Nothing.
        }
        if (isPrime(j))
        {
            ArrayList<Integer> res = new ArrayList<Integer>();
            res.add(j);
            return res;
        }
        else
            return null;
    }

    static boolean isPrime(int n)
    {
        if (n < 2) return false;
        if (n == 2) return true;
        if (n % 2 == 0) return false;
        for (int i = 3; i * i <= n; i += 2)
            if (n % i == 0)
                return false;
        return true;
    }
}

And here is the .class file I compiled it to.

Now, my question:

I know that there is a performance gain in using a 64-bit machine, but that doesn't explain this massive difference. So does anybody have any idea why this is happening?


Solution

On Windows, the 32-bit JVM uses the -client compiler by default, while the 64-bit JVM uses -server. The server JVM is more aggressive at removing code which doesn't do anything, e.g. empty loops. You will find such a loop takes about the same amount of time regardless of the count limit, because the time is dominated by how long it takes to detect and eliminate the loop. Try adding a second timed loop to the same method and you will find it takes almost no time regardless of the maximum value you set (assuming it's not an infinite loop). This is because the method will have been compiled by the time the second loop starts.

http://docs.oracle.com/javase/1.5.0/docs/guide/vm/server-class.html
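
As a quick illustration of that point, here is a minimal sketch (my own harness, not code from the question; the class name EmptyLoopTiming and the loop bound are arbitrary) that times the same empty loop twice inside one method. On the server VM the first measurement pays for warm-up and loop elimination, while the second is typically close to zero:

public class EmptyLoopTiming
{
    public static void main(String[] args)
    {
        // Run the method a few times so the JIT gets a chance to compile it.
        for (int run = 0; run < 5; run++)
        {
            timeTwice(1000000000);
        }
    }

    static void timeTwice(int limit)
    {
        long t0 = System.nanoTime();
        for (int i = 0; i < limit; i++)
        {
            // Nothing.
        }
        long t1 = System.nanoTime();
        for (int i = 0; i < limit; i++)
        {
            // Nothing.
        }
        long t2 = System.nanoTime();
        System.out.println("first loop: " + (t1 - t0) / 1000000 + " ms, "
                + "second loop: " + (t2 - t1) / 1000000 + " ms");
    }
}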

BTW: I would use System.nanoTime() and run your tests repeatedly for at least a couple of seconds.
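
For example (a sketch only; it assumes the work() method from the Test class in the question), the timing in main() could be repeated several times with System.nanoTime(), so that the later runs reflect fully JIT-compiled code:

// Inside main() of the Test class above, instead of timing a single pass:
for (int run = 0; run < 5; run++)
{
    long start = System.nanoTime();
    ArrayList<Integer> res = new ArrayList<Integer>();
    for (int k = 0; k < 50000; k++)
    {
        Collection<Integer> partres = work(k);
        if (partres != null)
            res.addAll(partres);
    }
    long end = System.nanoTime();
    // Printing res.size() also makes sure the result is actually used.
    System.out.println("Run " + run + ": " + (end - start) / 1000000
            + " ms (" + res.size() + " primes)");
}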

OTHER TIPS

64-bit Java always uses the -server JIT compiler, while your 32-bit JVM was probably using the -client JIT compiler.

When the C2 (a.k.a. -server) compiler sees something like this:

for (int i = 0; i < 1000000; i++)
{
  // Nothing.
}

it will notice that the loop does nothing and remove it. Your loop that does nothing will be optimised into nothing.

To foil that optimisation, you will have to make the loop do something (it could XOR all those values of i together, for instance) and make use of the result, as sketched below. Then the loop will look like real work to the compiler, and the code will be kept.
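
A minimal sketch of that idea (the volatile sink field and the class name BusyWork are my own choices, not something from the answer):

public class BusyWork
{
    // Writing the result to a volatile field counts as an observable use,
    // so the compiler cannot prove the loop is dead and delete it.
    static volatile int sink;

    public static void burnTime()
    {
        int acc = 0;
        for (int i = 0; i < 1000000; i++)
        {
            acc ^= i; // XOR all the i's together so the loop produces a result
        }
        sink = acc;
    }
}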

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow