Question

I am working on a project that is both memory- and computation-intensive. A significant portion of the execution is multi-threaded using a FixedThreadPool. In short, I have one thread that fetches data from several remote locations (using URL connections) and populates a BlockingQueue with objects to be analyzed, and n threads that pick up these objects and run the analysis. Edit: see code below.

Now this setup works like a charm on my Linux machine running OpenSUSE 11.3, but a colleague who is testing it on a very similar machine running Win7 is getting custom notifications of timeouts on the queue polling (see code below), lots of them actually. I have been monitoring the processor use on her machine, and it appears that the software does not get more than 15% of the CPUs, while on my machine the processor usage hits the roof, just as I intended.

My question is, then, can this be a sign of "starvation" of the queue? Could it be that the producer thread is not getting enough CPU time? If so, how do I go about giving one particular thread in the pool higher priority?

UPDATE: I have been trying to pinpoint the problem, with no joy... I did however gain some new insights.

  • Profiling the execution of the code with JVisualVM demonstrates a very peculiar behavior. The methods are called in short bursts of CPU-time with several seconds of no progress in between. This to me means that somehow the OS is hitting the brakes on the process.

  • Disabling the anti-virus and back-up daemons does not have any significant effect on the matter.

  • Changing the priority of java.exe (the only instance) through the task manager (advised here) does not change anything either. (That being said, I could not give "realtime" priority to java, and had to be content with "high" priority.)

  • Profiling the network usage shows a good flow of data in and out, so I am guessing that is not the bottleneck. (Network I/O is a considerable part of the execution time of the process, but I know that already, and it is pretty much the same percentage as what I get on my Linux machine.)

Any ideas as to how the Win7 OS might be limiting the CPU time available to my project? If it's not the OS, what could be the limiting factor? I would like to stress yet again that the machine is NOT running any other computation-intensive task at the same time, and there is almost no load on the CPUs other than my software. This is driving me crazy...

EDIT: relevant code

public ConcurrencyService(Dataset d, QueryService qserv, Set<MyObject> s){

    timeout = 3;
    this.qs = qserv;
    this.bq = qs.getQueue();
    this.ds = d;
    this.analyzedObjects = s;
    this.drc = DebugRoutineContainer.getInstance();
    this.started = false;

    int nbrOfProcs = Runtime.getRuntime().availableProcessors();
    poolSize = nbrOfProcs;
    pool = (ThreadPoolExecutor) Executors.newFixedThreadPool(poolSize);
    drc.setScoreLogStream(new PrintStream(qs.getScoreLogFile()));
}

public void serve() throws InterruptedException {
    try {
        this.ds.initDataset();
        this.started = true;
        pool.execute(new QueryingAction(qs));
        for(;;){
            MyObject p = bq.poll(timeout, TimeUnit.MINUTES);

            if(p != null){
                if (p.getId().equals("0"))
                    break;

                pool.submit(new AnalysisAction(ds, p, analyzedObjects, qs.getKnownAssocs()));
            }else 
                drc.log("Timed out while waiting for an object...");

        }

      } catch (Exception ex) {
            ex.printStackTrace();
            String exit_msg = "Unexpected error in core analysis, terminating execution!";
            drc.log(exit_msg);    // the message was built but never reported

      }finally{
            drc.log("--DEBUG: Termination criteria found, shutdown initiated..");
            drc.getMemoryInfo(true);    // dump meminfo to log

            pool.shutdown();

            int mins = 2;
            int nCores = poolSize;
            long    totalTasks = pool.getTaskCount(), 
                    compTasks = pool.getCompletedTaskCount(),
                    tasksRemaining = totalTasks - compTasks,
                    timeout = mins * tasksRemaining / nCores;

            drc.log("--DEBUG: Shutdown commenced, thread pool will terminate once all objects are processed, " +
                        "or will timeout in : " + timeout + " minutes... \n" + compTasks + " of " +  (totalTasks -1) + 
                        " objects have been analyzed so far, " + "mean process time is: " +
                        drc.getMeanProcTimeAsString() + " milliseconds.");

            pool.awaitTermination(timeout, TimeUnit.MINUTES);
      }

}

The class QueryingAction is a simple Runnable that calls the data acquisition method in the designated QueryService object which then populates a BlockingQueue. The AnalysisAction class does all the number-crunching for a single instance of MyObject.


Solution 6

So after weeks of fiddling, wrestling in code and other types of suffering I think I had a breakthrough, "a moment of clarity" if you will...

I managed to show that the program can exhibit the same slow behavior on my Linux machine and can indeed run full throttle on the problematic Win7 machine. The crux of the problem appears to be some sort of corruption of the system/cache files that are used to store the results of previous queries and, overall, speed up the analysis. You've got to love the irony: in this case they appeared to be the reason for EXTREMELY slow analysis. In retrospect, I should have known (a la Occam's razor)...

I am still not sure how the corruption occurs, but at least it is probably not related to the different OS. Using the system files from my machine, however, only increases the CPU usage on the Win7 host to about 40%. Profiling the process further has also revealed that, oddly enough, there is significantly more GC activity on Win7, which apparently took a lot of CPU time away from the number crunching. Giving -Xmx2g takes care of the excessive garbage collection: the CPU usage for the process shoots up to 95-96%, and the threads run smoothly.
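For anyone who wants to check the GC theory on their own setup without a profiler, here is a minimal sketch using the standard java.lang.management beans. The class name GcStats and its two helpers are made up for illustration; they are not part of the project above. One possible explanation for the difference between the two machines is a different default maximum heap, but that is only an assumption, which is why the sketch also prints the effective limit.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of the original project.
final class GcStats {

    /** Total time in ms spent in GC since JVM start, summed over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Maximum heap the JVM will attempt to use, in MiB (this reflects -Xmx). */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("max heap (MiB): " + maxHeapMiB());
        System.out.println("GC time so far (ms): " + totalGcMillis());
    }
}
```

Logging totalGcMillis() at the start and end of an analysis run on both machines should show whether GC time really differs; running with -verbose:gc gives similar information with no code at all.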

Now that my original question is answered, I have to say that overall Java responsiveness is definitely better in the Linux environment. Even without allocating more heap memory, I can easily multi-task while I am running an extensive analysis in the background. Things are not as smooth on Win7; e.g. resizing the GUI is significantly slower once the analysis takes off at full speed.

Thanks for all the replies, I am sorry for the partially misleading problem description. I merely shared what I found out while debugging to the best of my abilities. Anyways, I believe the bounty goes to Peter Lawrey, since he early on pointed to an I/O issue and it was his suggestion about a logger thread which eventually led me to the answer.

OTHER TIPS

I suspect the producer thread is not getting/loading the source data fast enough. This might not be a lack of CPU but an IO related issue. (not sure why you have time outs on your BlockingQueue)

It might be worth having a thread which periodically logs things like the number of tasks added and the length of the queue (e.g. every 5-15 seconds)
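A rough sketch of such a logger thread, assuming Java 8+ and the same kind of BlockingQueue and ThreadPoolExecutor as in the question. QueueMonitor, snapshot and start are names invented here for illustration:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the original project.
final class QueueMonitor {

    /** Builds one status line from the current queue and pool counters. */
    static String snapshot(BlockingQueue<?> queue, ThreadPoolExecutor pool) {
        return "queued=" + queue.size()
                + " submitted=" + pool.getTaskCount()
                + " completed=" + pool.getCompletedTaskCount()
                + " active=" + pool.getActiveCount();
    }

    /** Starts a daemon thread that prints a snapshot every periodSeconds. */
    static ScheduledExecutorService start(BlockingQueue<?> queue,
                                          ThreadPoolExecutor pool,
                                          long periodSeconds) {
        ScheduledExecutorService monitor =
                Executors.newSingleThreadScheduledExecutor(r -> {
                    Thread t = new Thread(r, "queue-monitor");
                    t.setDaemon(true); // must not keep the JVM alive
                    return t;
                });
        monitor.scheduleAtFixedRate(
                () -> System.out.println(
                        System.currentTimeMillis() + " " + snapshot(queue, pool)),
                0, periodSeconds, TimeUnit.SECONDS);
        return monitor;
    }
}
```

If the queue length stays near zero while the pool is mostly idle, the producer (or whatever it reads from) is the bottleneck; if the queue keeps growing, the consumers are. Note that the pool counters are approximate by design.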

So, if I correctly understand your problem, you have one thread to fetch data, and several threads to analyse the fetched data. Your problem is that the threads are not correctly synchronized to run together and take full advantage of the processor.

You have a typical producer-consumer problem with a single producer and several consumers. I advise you to rework your code a bit so that you have, instead, several independent consumer threads that are always waiting for resources to become available and only then run. This way you guarantee maximum processor use.

Consumer thread:

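while (!terminate)
{
    MyObject p;

    synchronized (Producer.getLockObject())
    {
        // re-check the condition in a loop: wait() can wake up spuriously,
        // and data may already be queued before this thread starts waiting
        while (Producer.isQueueEmpty()) // isQueueEmpty() is a hypothetical helper
        {
            try
            {
                //sleep (no processing at all)
                Producer.getLockObject().wait();
            }
            catch (InterruptedException e)
            {
                Thread.currentThread().interrupt();
                return;
            }
        }

        p = Producer.getObjectFromQueue(); //this function should be synchronized
    }

    //Analyse fetched data, and submit it to somewhere...
}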

Producer thread:

while (!terminate)
{
    MyObject newData = fetchData(); //fetch data from remote location

    addDataToQueue(newData); //this should also be synchronized

    synchronized (getLockObject())
    {
        //wake up one thread to deal with the data
        getLockObject().notify();
    }
}

You see that this way, your threads are always performing useful work or sleeping. This is just draft code to exemplify. See more explanation here: http://www.javamex.com/tutorials/wait_notify_how_to.shtml and here: http://www.java-samples.com/showtutorial.php?tutorialid=306
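For comparison, the same "work or sleep" behaviour can be had without hand-written wait/notify, since BlockingQueue.take() already blocks until an element is available. This is a different technique from the draft above, shown here only as a sketch: TakeLoopDemo and run are invented names, strings stand in for MyObject, and the "0" sentinel mirrors the termination id used in the question.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of the original project.
final class TakeLoopDemo {
    static final String POISON = "0"; // sentinel that tells a consumer to stop

    /** Runs one producer and nConsumers consumers; returns the number of items processed. */
    static int run(int nItems, int nConsumers) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        AtomicInteger processed = new AtomicInteger();

        Thread[] consumers = new Thread[nConsumers];
        for (int i = 0; i < nConsumers; i++) {
            consumers[i] = new Thread(() -> {
                try {
                    for (;;) {
                        String item = queue.take(); // blocks, no timeout needed
                        if (item.equals(POISON)) {
                            break;
                        }
                        processed.incrementAndGet(); // stand-in for the analysis
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            consumers[i].start();
        }

        // The calling thread plays the producer here.
        for (int i = 1; i <= nItems; i++) {
            queue.put("item-" + i);
        }
        for (int i = 0; i < nConsumers; i++) {
            queue.put(POISON); // one pill per consumer
        }

        for (Thread c : consumers) {
            c.join();
        }
        return processed.get();
    }
}
```

Calling TakeLoopDemo.run(100, 4) processes all 100 items and then lets the four consumers exit cleanly. Whether this fits the original design, where a single loop dispatches to a pool, is a judgment call.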

Priority won't help, since the problem is not an issue of deciding who gets precious resources -- resource usage isn't maxed. The only way the producer thread would not be getting enough CPU time is if it wasn't ready-to-run.

How many cores does the machine have? It's possible that the producer thread is running full speed and there still just isn't enough CPU to go around. It's also possible the producer is I/O bound.

You can try to separate the producer thread from the pool (i.e. create a distinct Thread and give the pool one thread fewer than its current capacity) and then set its priority to maximum via setPriority. See what happens, although priority rarely accounts for such a difference in performance.
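Something along these lines, as a sketch only. DedicatedProducer and its two methods are names invented for this example; the Runnable passed in would be whatever currently feeds the queue (QueryingAction in the question).

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper, not part of the original project.
final class DedicatedProducer {

    /** Pool for analysis tasks only: one core is left for the producer. */
    static ExecutorService newAnalysisPool() {
        int cores = Runtime.getRuntime().availableProcessors();
        return Executors.newFixedThreadPool(Math.max(1, cores - 1));
    }

    /** Runs the producer on its own thread at the highest Java priority. */
    static Thread startProducer(Runnable producerTask) {
        Thread producer = new Thread(producerTask, "producer");
        producer.setPriority(Thread.MAX_PRIORITY); // only a hint to the OS scheduler
        producer.start();
        return producer;
    }
}
```

In the question's serve() method this would replace pool.execute(new QueryingAction(qs)) with DedicatedProducer.startProducer(new QueryingAction(qs)). How Java priorities map to native ones is platform-dependent, so the effect may differ between Linux and Windows.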

When you say URL connection, do you mean local or remote? It could be that network speed is slowing your producer down.

I would think it was some OS specific issue because that is the core difference between the two units. More specifically, something is slowing down the data arriving through the remote connection.

Find some traffic analysis tool such as Wireshark and/or Networx and try to discover if there is anything throttling the Win PC. Perhaps it is going through a proxy that has some kind of rate cap configured.

Sorry, this is not really an answer, but it did not fit inside a comment and I think it is still worth the read:

  • Well, I am not Java-friendly,
  • but I recently had the same problem with C++ projects for machine control through USB.
  • On XP or W2K everything runs perfectly for months of 24/7 operation on any machine with 2 or more cores.
  • On W7 with a strong enough machine everything runs OK, but sometimes (about once every few hours) it freezes for a few seconds for no obvious reason.
  • On W7 with a relatively weak machine (2-core 1.66 GHz T2300E notebook) the threads freeze for some time and then run again, which under/overflows the USB/WIN/App FIFOs and collapses the communication ...
    • It appears that nothing is blocked; the W7 scheduler just occasionally does not give CPU to the right threads.
    • I thought that the USB driver (JUNGO) communication was freezing, but that is not true: I measured it and it is OK even during a freeze.
    • The freeze lasted about 6-15 seconds, about once per minute.
    • After adding some safety sleeps to the thread loops, the freeze shortened to about 0.5 sec,
    • but it is still there.
    • Even if the App does not under/overflow the FIFOs, the Windows USB driver side does (a few times per minute, for a few ms).
  • Changing the exe/thread priority and class does not affect performance on W7 (on XP and W2K it works as it should).

As you can see, it seems we most likely have the same problem. In my case:

  • it is not I/O related (when I replace the USB thread with a simulation of the device, it behaves similarly)
  • adding Sleep to time-critical code helps a lot
  • the error is also present with a low thread count [2 fast (17ms) + 1 slow (250ms) + App code = 4]
  • my CPU consumption on the slow W7 machine is also not 100% but about 95%, which is OK because I have sleeps everywhere
  • my Apps use about 40-100MB of memory but are CPU computation demanding ...
    • but not that much; they could run safely on much slower machines
    • but because of the USB driver connection and multiple device support they need at least 2 cores
  • my next step is to add some kind of execution time logging/analysis to see what is happening in more detail
  • and also a small rewrite of the send/receive threads to see if it helps

When I learn something new/useful I will add it.

Licensed under: CC-BY-SA with attribution