Java sorting an array of words with threads

Question 1

I think you should use Merge Sort algorithm and base its implementation on ForkJoinPool (if you're using Java 7, of course).

This algorithm fits perfectly because the job can be splitted to independent tasks can be handled by different threads. Now, ForkJoinPool provides you with easy-to-use pool where you can submit your sorting tasks.

The implementation should be done like this:

Each task does sorting a piece of array/list given;
If the array is small (exact meaning of small can be configured by constant) - it is sorted using standard .sort() method, otherwise it is split in two halves and these halves are submitted to the pool for sorting;
Then the task waits for both child tasks to complete and performs merging of two sorted arrays/lists into one, and returns it;

Here is the sample implementation of the algorithm. Please be aware that this one is far away from optimal as it consumes tons of additional memory. I implemented it like that to demostrate approach. Run it with -Xmx1024m.

public class ForkJoinSort {

    private static final int LIST_SIZE = 10000;

    private static final int SORT_THRESHOLD = 10; //the minimal length of the list to use standard java sort rather than mergesort

    private static ForkJoinPool forkJoinPool = new ForkJoinPool();

    public static class MergeSortTask extends RecursiveTask<List<Integer>> {

        private final List<Integer> victim;

        public MergeSortTask(List<Integer> victim) {
            this.victim = victim;
        }

        @Override
        protected List<Integer> compute() {
            if (victim.size() < SORT_THRESHOLD) {
                Collections.sort(victim);
                return victim;
            }

            //sorting left and right parts of the list separately in separate threads
            MergeSortTask leftTask = new MergeSortTask(victim.subList(0, victim.size() / 2));
            MergeSortTask rightTask = new MergeSortTask(victim.subList(victim.size() / 2, victim.size()));
            forkJoinPool.submit(leftTask);
            forkJoinPool.submit(rightTask);

            //do merge
            return merge(leftTask.join(), rightTask.join());
        }

        public List<Integer> merge(List<Integer> left, List<Integer> right) {
            List<Integer> result = new ArrayList<Integer>(left.size() + right.size());

            Iterator<Integer> leftIterator = left.iterator();
            Iterator<Integer> rightIterator = right.iterator();

            Integer fromLeft = null;
            Integer fromRight = null;

            while (leftIterator.hasNext() || rightIterator.hasNext()) {
                //if current value taken from the iterator is null - take new one if possible, otherwise do nothing
                fromLeft = fromLeft == null ? leftIterator.hasNext() ? leftIterator.next() : null : fromLeft;
                fromRight = fromRight == null ? rightIterator.hasNext() ? rightIterator.next() : null : fromRight;

                if (fromLeft != null && (fromRight == null || fromLeft <= fromRight)) {
                    result.add(fromLeft);
                    fromLeft = null; //this is done to indicate that value from left iterator already passed to result list
                } else if (fromRight != null && (fromLeft == null || fromRight <= fromLeft)) {
                    result.add(fromRight);
                    fromRight = null;
                }
            }

            return result;
        }
    }

    public static void main(String[] args) throws Exception {
        SecureRandom random = new SecureRandom();

        //generate array of random numbers
        List<Integer> victim = new ArrayList<Integer>(LIST_SIZE);
        for (int i = 0; i < LIST_SIZE; ++i) {
            victim.add(random.nextInt());
        }

        //do some benchmarking as long as we're here
        long timeMark = System.currentTimeMillis();
        MergeSortTask task = new MergeSortTask(victim);
        forkJoinPool.submit(task);
        List<Integer> probablySorted = task.get();
        timeMark = System.currentTimeMillis() - timeMark;

        //asserting that array is sorted
        for (int i = 0; i < probablySorted.size() - 1; ++i) {
            if (probablySorted.get(i) > probablySorted.get(i + 1)) {
                throw new IllegalStateException("Sorting failed :(");
            }
        }

        System.out.println("Sorting " + LIST_SIZE + " random numbers using merge sort algorithm in " + Runtime.getRuntime().availableProcessors() + " threads took " + timeMark + " ms.");
    }
}

I tried to make code easy-readable. If I failed somewhere, do not hesitate to ask.

Question 2

As @Alexey correctly points out the easiest way to do a parallel sort is definitely using a fork/join framework and merge sort. That's very easy to do and looks something like (pseudo code):

def mergesort(a, i0, i1):
    if i0 == i1:
        return
    im = i0 + (i1 - i0) / 2
    fork mergesort(a, i0, im)
    fork mergesort(a, im, i1)
    join
    merge(a, i0, im, i1) # serial merge

If we analyse this we see that we have (easy to show by master theorem):

Work: T_1(n) = 2T_1(n / 2) + O(n) = O(n lg n)
Span: T_inf(n) = 1 T_inf(n / 2) + O(n) = O(n)

where work means the total amount of work done and span describes how long it'd take if we had infinitely many threads available (basically the depth of the tree).

The parallelism an algorithm has is basically Work / Span which in this case gives us O(lg n) - that's practically irrelevant although if we use a good serial sort algorithm for small enough leaf sizes this still works quite well.

We can do better though by parallelizing the merge. This can be done without an auxiliary array, but I'll leave that as an exercise for the reader (means: not easy and I'd have to look up how to actually do that).

Parallel merge: Assume that we have an auxiliary array aux with two sorted arrays in [i0, i1) and [j0, j1) and we want to put the merged sub arrays into array a between k0, k1. We do this recursively again:

Compute im = i0 + (i1 - i0) / 2 - aux[im] is the median of the left half
Find jm so that aux[im] is inserted directly before aux[jm]
insert aux[im] into the correct position in a, which we know if we have im and jm.
merge the two subarrays.

Confused? Well, the following illustration (I'm in CS not arts..) should help I hope: Helpful illustration

In code this looks something like

def merge(a, aux, i0, i1, j0, j1, k0, k1):
    if i0 == i1:
        copy aux[j0, j1] to a[k0, k1]
        return
    if j0 == j1:
        copy aux[i0, i1] to a[k0, k1]
        return       
    im = im = i0 + (i1 - i0) / 2 
    jm = find(aux, j0, j1, aux[im]) 
    km = k0 + (im - i0) + 1 + (jm - j0 )
    a[km] = aux[im]
    fork merge(a, aux, i0, im, j0, jm, k0, km)
    fork merge(a, aux, im + 1, i1, jm, j1, km + 1, k1)
    join

It's important to note that find has to be done using a simple serial binary search in O(lg n) since we know the right side is sorted.

Using such a parallel merge gives us the same work, but reduces the span to O(lg^3 n), which translates to a parallelism of O(n / lg^2 n) - which is a great improvement.

Nota bene: As for any parallel algorithm in practice you will want to use a simple serial version if the problem sizes get too small (quicksort or something) - which leaf size is the best has to be evaluated for each architecture separately through experimentation.

Question 3

I am a teaching assistant (and examiner for the assignment in question) for the university course you're following. The answers you've been given to your question(s) are great, and probably describe THE best ways to solve this problem for optimal performance and speed-up compared to a fully sequential sort+merge. You should keep in mind, though, that this is a beginner's course in object oriented programming, and your very first assignment touching parallelism and multithreading.

As there are only 14 hours left until the deadline, I would not recommend you to go for an advanced approach to the problem, like extending library classes such as ForkJoinPool, parallelising a double-pivot quicksort, or whatnot. The simplest solution to this problem, which is also the solution we had in mind when we gave you the assignment, can be implemented following these steps:

Algorithm:

n = number of threads

Read the words into a primitive String array from the file.
Split the array into n approximately equally long pieces, or alternatively, create a function which assigns indexes in the initial array to your threads.
Create the threads, giving them a reference ("Java pointer") to your monitor object.
Once the threads have been assigned parts of your initial array, start sorting them using whatever algorithm you would like. Insertion sort is probably the "easiest" algorithm to implement if you're short on time.
Once sorting is finished in a certain thread, have the thread report back to the monitor, letting it know sorting has been finished.
When two threads have reported back, have one of the threads perform an initially-sorted merge sort (description further down) on the arrays sorted by the threads.
Write the result back to a new file on your system

"Initially-sorted merge sort"

Create two ints, i and j, and let them represent indexes in the two initially sorted arrays.
Create an array as long as the two arrays' lengths added together.
Iterate from 0 through result.length, and check whether array1[i] is smaller than array2[j]. 3.1 If it is, add array1[i] to your result array, and increment i. 3.2 If it is not, add array2[j] to your result array, and increment j.
Once you've reached the end of either initial array, simply add the rest of the other array to the result array.
At the end of the for-loop, your result array is sorted, and all the strings from array1 and array2 are contained in your result array.

Good luck with your assignment!