Question

I have a very simple unit test that just allocates a lot of Strings:

public class AllocationSpeedTest extends TestCase {

    public void testAllocation() throws Exception {

        for (int i = 0; i < 1000; i++) {
            long startTime = System.currentTimeMillis();
            String a = "dummy";
            for (int j = 0; j < 1000; j++) {
                a += "allocation driven";
            }
            System.out.println(i + ": " + (System.currentTimeMillis() - startTime) + "ms " + a.length());
        }

    }

}

On my Windows PC (Intel Core Duo, 2.2GHz, 2GB) this prints on average:

...
71: 47ms 17005
72: 47ms 17005
73: 46ms 17005
74: 47ms 17005
75: 47ms 17005
76: 47ms 17005
77: 47ms 17005
78: 47ms 17005
79: 47ms 17005
80: 62ms 17005
81: 47ms 17005
...

On SunOS (5.10 Generic_138888-03 sun4v sparc SUNW, SPARC-Enterprise-T5120):

...
786: 227ms 17005
787: 294ms 17005
788: 300ms 17005
789: 224ms 17005
790: 260ms 17005
791: 242ms 17005
792: 263ms 17005
793: 287ms 17005
794: 219ms 17005
795: 279ms 17005
796: 278ms 17005
797: 231ms 17005
798: 291ms 17005
799: 246ms 17005
800: 327ms 17005
...

JDK version is 1.4.2_18 on both machines. JVM parameters are the same and are:

-server -Xmx256m -Xms256m

Can anyone explain why the Sun "super server" is slower?

(http://www.sun.com/servers/coolthreads/t5120/performance.xml)


Solution

The SPARC CPU is indeed slower (1.2GHz), and according to one of Sun's engineers the T2 is usually about 3 times slower than modern Intel processors for single-threaded applications. He also stated, though, that in a multi-threaded environment SPARC should be faster.

I wrote a multi-threaded test using the GroboUtils library and measured both allocations (through concatenation) and simple calculations (a += j * j) to exercise the processor. I got the following results:

1 thread : Intel : Calculations test : 43ms
100 threads : Intel : Calculations test : 225ms

1 thread : Intel : Allocations test : 35ms
100 threads : Intel : Allocations test : 1754ms

1 thread : SPARC : Calculations test : 197ms
100 threads : SPARC : Calculations test : 261ms

1 thread : SPARC : Allocations test : 236ms
100 threads : SPARC : Allocations test : 1517ms

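SPARC shows its strength with 100 threads: it beats Intel on the allocation test (1517ms vs 1754ms) and comes close on the calculation test (261ms vs 225ms), despite being about 5 to 7 times slower with a single thread.

Here is the multi-threaded calculation test: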

import java.util.ArrayList;
import java.util.List;

import net.sourceforge.groboutils.junit.v1.MultiThreadedTestRunner;
import net.sourceforge.groboutils.junit.v1.TestRunnable;
import junit.framework.TestCase;

public class TM1_CalculationSpeedTest extends TestCase {

    public void testCalculation() throws Throwable {

        List threads = new ArrayList();
        for (int i = 0; i < 100; i++) {
            threads.add(new Requester());
        }
        MultiThreadedTestRunner mttr = new MultiThreadedTestRunner((TestRunnable[]) threads.toArray(new TestRunnable[threads.size()]));
        mttr.runTestRunnables(2 * 60 * 1000);

    }

    public class Requester extends TestRunnable {

        public void runTest() throws Exception {
            long startTime = System.currentTimeMillis();
            long a = 0;
            for (int j = 0; j < 10000000; j++) {
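                // j * j is evaluated in int arithmetic and overflows for large j;
                // the value of a is not meaningful, only the elapsed time is.
                a += j * j;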
            }
            long endTime = System.currentTimeMillis();
            System.out.println(this + ": " + (endTime - startTime) + "ms " + a);
        }

    }

}

Here is the multi-threaded allocation test:

import java.util.ArrayList;
import java.util.List;

import junit.framework.TestCase;
import net.sourceforge.groboutils.junit.v1.MultiThreadedTestRunner;
import net.sourceforge.groboutils.junit.v1.TestRunnable;

public class TM2_AllocationSpeedTest extends TestCase {

    public void testAllocation() throws Throwable {

        List threads = new ArrayList();
        for (int i = 0; i < 100; i++) {
            threads.add(new Requester());   
        }
        MultiThreadedTestRunner mttr = new MultiThreadedTestRunner((TestRunnable[]) threads.toArray(new TestRunnable[threads.size()]));
        mttr.runTestRunnables(2 * 60 * 1000);

    }

    public class Requester extends TestRunnable {

        public void runTest() throws Exception {
            long startTime = System.currentTimeMillis();
            String a = "dummy";
            for (int j = 0; j < 1000; j++) {
                a += "allocation driven";
            }
            long endTime = System.currentTimeMillis();
            System.out.println(this + ": " + (endTime - startTime) + "ms " + a.length());
        }

    }

}
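For readers who do not have GroboUtils available, roughly the same allocation measurement can be sketched with plain java.lang.Thread. This is not the code that produced the numbers above; the class name PlainThreadAllocationTest and the method runWorkers are made up for illustration, and the per-thread work is simply copied from the Requester class.

```java
public class PlainThreadAllocationTest {

    // Starts threadCount threads that each run the concatenation loop,
    // waits for all of them, and returns the elapsed milliseconds per thread.
    static long[] runWorkers(int threadCount) {
        final long[] elapsed = new long[threadCount];
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            final int index = i;
            threads[i] = new Thread(new Runnable() {
                public void run() {
                    long startTime = System.currentTimeMillis();
                    String a = "dummy";
                    for (int j = 0; j < 1000; j++) {
                        a += "allocation driven";
                    }
                    elapsed[index] = System.currentTimeMillis() - startTime;
                }
            });
        }
        for (int i = 0; i < threadCount; i++) {
            threads[i].start();
        }
        for (int i = 0; i < threadCount; i++) {
            try {
                // join() also makes the worker's write to elapsed[] visible here.
                threads[i].join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException("interrupted while waiting for workers");
            }
        }
        return elapsed;
    }

    public static void main(String[] args) {
        long[] elapsed = runWorkers(100);
        for (int i = 0; i < elapsed.length; i++) {
            System.out.println("thread " + i + ": " + elapsed[i] + "ms");
        }
    }
}
```

The timings are printed only after every thread has finished, so console output does not compete with the measured work.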

OTHER TIPS

It's my understanding that UltraSPARC T2-based machines are aimed at performance-per-watt rather than raw performance. You might try dividing the allocation time by the power consumption and see what kind of numbers you get. :)

Is there a reason you're running 1.4.2 instead of 1.6?

The Sun hardware is slower, and the VM may be somewhat slower as well.

I don't think that this is measuring memory allocation. For a start, there is an awful lot of character copying going on in a += "allocation driven";. But I suspect that the real bottleneck is in getting the output from System.out.println(...) through the network layers from the app on the Sun server to your remote workstation.
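To put a rough number on the character copying, the sketch below adds up the length of every intermediate string the loop produces. It treats that sum as a lower bound on the characters copied, on the assumption that each a += ... builds a new string holding the old contents plus the appended piece; the real figure depends on how the JVM implements concatenation. The class and method names are hypothetical.

```java
public class CopyCountEstimate {

    // Sums the lengths of all intermediate strings created by repeatedly
    // appending piece to initial. Each intermediate string has to be filled
    // with at least that many characters, so the sum is a lower bound.
    static long estimateCopiedChars(String initial, String piece, int iterations) {
        long copied = 0;
        long length = initial.length();
        for (int j = 0; j < iterations; j++) {
            length += piece.length();
            copied += length;
        }
        return copied;
    }

    public static void main(String[] args) {
        long copied = estimateCopiedChars("dummy", "allocation driven", 1000);
        System.out.println("at least " + copied + " characters copied per outer iteration");
    }
}
```

For the loop in the question this comes to 8,513,500 characters per outer iteration, against only 1000 appended pieces, which is why the test exercises memory copying at least as much as allocation.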

As an experiment, try multiplying the inner loop count by 10 and 100, and see if that "speeds up" the Sun server relative to your workstation.
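One possible way to run that experiment is sketched below. The inner loop count becomes a parameter and all printing is deferred until the measurements are done, so console or network output cannot affect the timings. The class name ScaledConcatTiming and its methods are hypothetical. Because the copying cost grows roughly with the square of the loop count, the 100x case (100000 iterations) is left out of main here; it can be added, but expect it to run far longer than the other two.

```java
public class ScaledConcatTiming {

    // Runs the concatenation loop from the question with a configurable
    // inner loop count and returns the length of the resulting string.
    static int concatenate(int innerCount) {
        String a = "dummy";
        for (int j = 0; j < innerCount; j++) {
            a += "allocation driven";
        }
        return a.length();
    }

    // Times a single run; nothing is printed inside the timed region.
    static long timeMillis(int innerCount) {
        long startTime = System.currentTimeMillis();
        concatenate(innerCount);
        return System.currentTimeMillis() - startTime;
    }

    public static void main(String[] args) {
        int[] counts = {1000, 10000};
        long[] elapsed = new long[counts.length];
        for (int i = 0; i < counts.length; i++) {
            elapsed[i] = timeMillis(counts[i]);
        }
        // Output happens only after all measurements have been taken.
        for (int i = 0; i < counts.length; i++) {
            System.out.println(counts[i] + " iterations: " + elapsed[i] + "ms");
        }
    }
}
```

If the ratio between the two machines stays about the same at both loop counts, output latency is unlikely to be the main factor.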

Another thing you could try is to move the inner loop into a separate method. Since all the work is done in a single invocation of one method (testAllocation in your test), it is possible that the JIT compiler never gets a chance to compile it.
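A minimal sketch of that idea follows: the loop lives in its own method, which is called a number of times before timing starts so the JIT compiler has an opportunity to compile it. The class name, the method names, and the warm-up count of 200 are arbitrary choices for illustration, not values known to trigger compilation on any particular VM.

```java
public class WarmedUpConcatTiming {

    // The inner loop from the question, moved into its own method so that
    // it is invoked many times rather than running inside one long call.
    static int concatenate() {
        String a = "dummy";
        for (int j = 0; j < 1000; j++) {
            a += "allocation driven";
        }
        return a.length();
    }

    // Calls concatenate() warmupRuns times without timing, then returns the
    // total elapsed milliseconds for timedRuns further calls.
    static long timeAfterWarmup(int warmupRuns, int timedRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            concatenate();
        }
        long startTime = System.currentTimeMillis();
        for (int i = 0; i < timedRuns; i++) {
            concatenate();
        }
        return System.currentTimeMillis() - startTime;
    }

    public static void main(String[] args) {
        int timedRuns = 100;
        long total = timeAfterWarmup(200, timedRuns);
        System.out.println("average: " + ((double) total / timedRuns) + "ms per run");
    }
}
```

Comparing the average with and without the warm-up phase on each machine gives a hint of how much of the gap is due to compilation rather than hardware.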

(Artificial "micro-benchmarks" like this are always susceptible to effects like these. I tend to distrust them.)

Licensed under: CC-BY-SA with attribution