Question

I am trying to understand the 3-way radix Quicksort, and i dont understand why the the CUTOFF variable there? and the insertion method?

public class Quick3string {

    private static final int CUTOFF =  15;   // cutoff to insertion sort

    // sort the array a[] of strings
    public static void sort(String[] a) {
        // StdRandom.shuffle(a);
        sort(a, 0, a.length-1, 0);
        assert isSorted(a);
    }

    // return the dth character of s, -1 if d = length of s
    private static int charAt(String s, int d) { 
        assert d >= 0 && d <= s.length();
        if (d == s.length()) return -1;
        return s.charAt(d);
    }


    // 3-way string quicksort a[lo..hi] starting at dth character
    private static void sort(String[] a, int lo, int hi, int d) { 

        // cutoff to insertion sort for small subarrays
        if (hi <= lo + CUTOFF) {
            insertion(a, lo, hi, d);
            return;
        }

        int lt = lo, gt = hi;
        int v = charAt(a[lo], d);
        int i = lo + 1;
        while (i <= gt) {
            int t = charAt(a[i], d);
            if      (t < v) exch(a, lt++, i++);
            else if (t > v) exch(a, i, gt--);
            else              i++;
        }

        // a[lo..lt-1] < v = a[lt..gt] < a[gt+1..hi]. 
        sort(a, lo, lt-1, d);
        if (v >= 0) sort(a, lt, gt, d+1);
        sort(a, gt+1, hi, d);
    }

    // sort from a[lo] to a[hi], starting at the dth character
    private static void insertion(String[] a, int lo, int hi, int d) {
        for (int i = lo; i <= hi; i++)
            for (int j = i; j > lo && less(a[j], a[j-1], d); j--)
                exch(a, j, j-1);
    }

    // exchange a[i] and a[j]
    private static void exch(String[] a, int i, int j) {
        String temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }

    // is v less than w, starting at character d
    private static boolean less(String v, String w, int d) {
        assert v.substring(0, d).equals(w.substring(0, d));
        return v.substring(d).compareTo(w.substring(d)) < 0; 
    }


    // is the array sorted
    private static boolean isSorted(String[] a) {
        for (int i = 1; i < a.length; i++)
            if (a[i].compareTo(a[i-1]) < 0) return false;
        return true;
    }


    public static void main(String[] args) {

        // read in the strings from standard input
        String[] a = StdIn.readAll().split("\\s+");
        int N = a.length;

        // sort the strings
        sort(a);

        // print the results
        for (int i = 0; i < N; i++)
            StdOut.println(a[i]);
    }
}

from http://www.cs.princeton.edu/algs4/51radix/Quick3string.java.html

Was it helpful?

Solution

It would appear to be used in order to invoke insertion sort for sufficiently small (size <= 15) arrays. This is most likely to speed up sorting.

OTHER TIPS

It's a simple optimization of quicksort algorithm. The cost of recursive calls in quicksort are quite high, so for small arrays insertion sort works better. So, the idea is, that if length of subarray to be sorted os below certain threshold, it's better to sort it using insertion sort than quicksort. In your example, CUTOFF variable defines that threshold, i.e. if less than 15 elements are left, they are sorted using insertion sort instead of quicksort.

The sort method above is a recursive method. And every recursive method should have some kind of base case (otherwise it will keep calling itself, eventually leading to a stack overflow).

The insertion part is the base case in that method, because at every recursive step, the hi-lo difference keeps decreasing, & when its less than CUTOFF, the insertion sort will eventually be triggered, forcing the recursion to stop.

if (hi <= lo + CUTOFF) {       // base case
    insertion(a, lo, hi, d);
    return;
}

Now, why insertion ? Because it works well for small arrays. More on sorting here: http://www.sorting-algorithms.com/

This idea comes from Robert Sedgewick, who knows more about Quicksort than probably any man alive. It is cited in Donald E. Knuth, The Art of Computer Programming, Volume III, where he shows that for small M, insertion sort is faster than Quicksort, so he recommends not sorting small partitions < M at all and leaving it to one final insertion sort over the whole data set at the end. Knuth gives a function for calculating M (which is your CUTOFF), and which is 9 for his MIX pseudo-computer.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top