Question

Suppose I have the following code which I wan't to optimize by spreading the workload over the multiple CPU cores of my PC:

double[] largeArray = getMyLargeArray();
double result = 0;
for (double d : largeArray)
    result += d;
System.out.println(result);

In this example I could distribute the work done within the for-loop over multiple threads and verify that the threads have all terminated before proceeding to printing the result. I therefore came up with something that looks like this:

final double[] largeArray = getMyLargeArray();
int nThreads = 5;
final double[] intermediateResults = new double[nThreads];

Thread[] threads = new Thread[nThreads];
final int nItemsPerThread = largeArray.length/nThreads;
for (int t = 0; t<nThreads; t++) {
    final int t2 = t;
    threads[t] = new Thread(){
        @Override public void run() {
            for (int d = t2*nItemsPerThread; d<(t2+1)*nItemsPerThread; d++)
                intermediateResults[t2] += largeArray[d];
        }
    };
}
for (Thread t : threads)
    t.start();
for (Thread t : threads)
    try {
        t.join();
    } catch (InterruptedException e) { }
double result = 0;
for (double d : intermediateResults)
    result += d;
System.out.println(result);

Assume that the length of the largeArray is dividable by nThreads. This solution works correctly.

However, I am encountering the problem that the above threading of for-loops occurs a lot in my program, which causes a lot of overhead due to the creation and garbage collection of threads. I am therefore looking at modifying my code by using a ThreadPoolExecutor. The threads giving the intermediate results would then be reused in the next execution (summation, in this example).

Since I store my intermediate results in an array of a size which has to be known beforehand, I was thinking of using a thread pool of fixed size. I am having trouble, however, with letting a thread know at which place in the array it should store its result. Should I define my own ThreadFactory?

Or am I better of using an array of ExecutorServices created by the method Executors.newSingleThreadExecutor(ThreadFactory myNumberedThreadFactory)?

Note that in my actual program it is very hard to replace the double[] intermediateResults with something of another type. I would prefer a solution which is confined to creating the right kind of thread pool.

Was it helpful?

Solution

I am having trouble, however, with letting a thread know at which place in the array it should store its result. Should I define my own ThreadFactory?

No need for that. The interfaces used by executors (Runnable and Callable) are run by threads, and you can pass whatever arguments to implementations you want to pass (for instance, an array, a begin index and an end index).

A ThreadPoolExecutor is indeed a good solution. Also look at FutureTask if you have runnables bearing results.

OTHER TIPS

ExecutorService provides you with API to get the result from thread pool via Future interface:

 Future<Double> futureResult = executorService.submit(new Callable<Double>() {
     Double call() {
         double totalForChunk = 0.0;
         // do calculation here
         return totalForChunk;
     }
 });

Now all you need to do is to submit tasks (Callable instances) and wait for result to be available:

 List<Future<Double>> results = new ArrayList<Double>();
 for (int i = 0; i < nChunks; i++) {
     results.add(executorService.submit(callableTask));
 }

Or even simpler:

 List<Future<Double>> results = executorService.invokeAll(callableTaskList);

The rest is easy, iterate over results and collect total:

 double total = 0.0;
 for (Future<Double> result : results) {
     total += result.get(); // this will block until your task is completed by executor service
 }

Having that said, you do not care how much threads you have in executor service. You just submit tasks and collect results when they are available.

You would be better off creating "worker" threads that take information about work to be performed from a queue. Your process would then be to create an initially empty WorkQueue and then create and start the worker threads. Each worker thread would pick up its work from the queue, do the work, and put the work on a "finished" queue for the master to pick up and handle.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top