Question

I have a program that sorts big files by splitting them into chunks, sorting the chunks, and merging them into a final sorted file. The application runs one thread for loading and saving data; only that thread performs I/O. Two more threads receive chunk data, sort it, and send the sorted data back to the I/O thread.

So in general there are 4 threads running: the main thread, the thread that loads/saves data, and two threads that sort data.

I expected that during execution I would see 1 sleeping thread (main) that takes no CPU time and 3 active threads that each use 1 CPU core.

When I run this program on a dual 6-core processor machine with hyper-threading (24 logical CPUs), I see that ALL 24 CPUs are loaded at 100%!

Initially I thought that the sort algorithm was multithreaded, but after looking into the Java sources I found that it's not.

I'm using a simple Collections.sort(LinkedList) to sort the data...

Here are some details:

# java -version
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)

# uname -a
Linux 2.6.32-28-server #55-Ubuntu SMP Mon Jan 10 23:57:16 UTC 2011 x86_64 GNU/Linux

I was using nmon to monitor processor loading.

I would appreciate any explanation of this behaviour and any advice on how to control the CPU load, as this particular task doesn't leave CPU time for other applications.

[UPDATE] I used jvisualvm to count threads; it shows only the threads I know about. I also wrote a simple test program (see below) that runs only the main thread and got exactly the same results: all 24 processors are almost 100% busy during code execution.

import java.util.LinkedList;
import java.util.Random;

public class Test {

    public void run() {
        Random r = new Random();
        // Roughly 5 million entries (5000000 to 5000009)
        int len = r.nextInt(10) + 5000000;
        LinkedList<String> list = new LinkedList<String>();
        for (int i = 0; i < len; i++) {
            // Every iteration allocates new String objects
            list.add(new String("test" + r.nextInt(50000000)));
        }
        System.out.println("Inserted " + list.size() + " items");
        list.clear();
    }

    public static void main(String[] argv) {
        Test t = new Test();
        t.run();
        System.out.println("Done");
    }
}

[UPDATE]
Here is the screenshot I made while running the program above (using nmon): http://imageshack.us/photo/my-images/716/cpuload.png/


Solution

I would suggest that this is an nmon question rather than a Java question. To investigate, I would look at the top command, which reports CPU usage per process. I predict the following result: you will see one Java thread using nearly 100% CPU time (which is fine, since the per-process percentage in top is relative to one (virtual) core), and maybe a second and third Java thread with much lower CPU usage (the I/O threads). Depending on the choice of GC you might even spot one or more GC threads, though far fewer than 20.
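The same per-thread view can be approximated from inside the JVM. The following is only a minimal sketch: it uses the standard java.lang.management API, and the class name ThreadCpuReport and the method cpuMillisByThread are hypothetical names chosen for illustration. It lists only Java-level threads; VM-internal threads such as GC workers are not covered, so a per-thread view in top (for example top -H -p <pid>) is still needed to see those.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the original program.
public class ThreadCpuReport {

    /** Maps "thread name (id)" to consumed CPU time in milliseconds for live Java threads. */
    static Map<String, Long> cpuMillisByThread() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<String, Long> result = new LinkedHashMap<String, Long>();
        if (!mx.isThreadCpuTimeSupported()) {
            return result; // this JVM cannot report per-thread CPU time
        }
        for (long id : mx.getAllThreadIds()) {
            ThreadInfo info = mx.getThreadInfo(id); // null if the thread is no longer alive
            long nanos = mx.getThreadCpuTime(id);   // -1 if not alive or measurement is disabled
            if (info != null && nanos >= 0) {
                result.put(info.getThreadName() + " (" + id + ")", nanos / 1000000L);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Long> e : cpuMillisByThread().entrySet()) {
            System.out.println(e.getValue() + " ms  " + e.getKey());
        }
    }
}
```

Calling cpuMillisByThread() periodically from the application and comparing successive readings would show which of the known threads actually accumulate CPU time.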

HotSpot, however, will not (and to my knowledge cannot) parallelize a sequential task on its own.
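To check whether garbage collection accounts for a noticeable part of the load in the single-threaded test, the collectors' own counters can be read through the standard GarbageCollectorMXBean API. This is a sketch under assumptions: the class GcShare and its methods are hypothetical names, the loop is modelled on the Test class from the question with a smaller size (1000000 entries) to stay within a default heap, and getCollectionTime() reports approximate elapsed time, not CPU time summed over GC threads.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedList;
import java.util.Random;

// Hypothetical measurement harness, not part of the original program.
public class GcShare {

    /** Sum of the approximate elapsed collection time (ms) reported by all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Allocation-heavy loop modelled on the question's Test class; returns the list size. */
    static int allocate(int len) {
        Random r = new Random();
        LinkedList<String> list = new LinkedList<String>();
        for (int i = 0; i < len; i++) {
            list.add("test" + r.nextInt(50000000));
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println("Logical CPUs seen by the JVM: "
                + Runtime.getRuntime().availableProcessors());
        long gcBefore = totalGcMillis();
        long wallBefore = System.nanoTime();
        int n = allocate(1000000);
        long wallMillis = (System.nanoTime() - wallBefore) / 1000000L;
        long gcMillis = totalGcMillis() - gcBefore;
        System.out.println("Inserted " + n + " items in " + wallMillis
                + " ms, reported GC time " + gcMillis + " ms");
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount() + " collections");
        }
    }
}
```

If the reported GC time is a large fraction of the wall time, the collector's worker threads are a plausible source of the extra load. In that case, limiting them with HotSpot's -XX:ParallelGCThreads=<n> option may be worth trying, though whether that helps here is an assumption that would need to be measured.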

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow