tukey's ninther for different shufflings of same data

https://stackoverflow.com/questions/17628352

03-06-2022
|

Frage

While implementing improvements to quicksort partitioning,I tried to use Tukey's ninther to find the pivot (borrowing almost everything from sedgewick's implementation in QuickX.java)

My code below gives different results each time the array of integers is shuffled.

import java.util.Random;
public class TukeysNintherDemo{    
    public static int tukeysNinther(Comparable[] a,int lo,int hi){
        int N = hi - lo + 1;
        int mid = lo + N/2;
        int delta = N/8;
        int m1 = median3a(a,lo,lo+delta,lo+2*delta);
        int m2 = median3a(a,mid-delta,mid,mid+delta);
        int m3 = median3a(a,hi-2*delta,hi-delta,hi);
        int tn = median3a(a,m1,m2,m3);
        return tn;
    }

    // return the index of the median element among a[i], a[j], and a[k]
    private static int median3a(Comparable[] a, int i, int j, int k) {
        return (less(a[i], a[j]) ?
               (less(a[j], a[k]) ? j : less(a[i], a[k]) ? k : i) :
               (less(a[k], a[j]) ? j : less(a[k], a[i]) ? k : i));
    }

    private static boolean less(Comparable x,Comparable y){
        return x.compareTo(y) < 0;
    }
    public static void shuffle(Object[] a) {
    Random random = new Random(System.currentTimeMillis());
        int N = a.length;
        for (int i = 0; i < N; i++) {
            int r = i + random.nextInt(N-i);     // between i and N-1
            Object temp = a[i];
            a[i] = a[r];
            a[r] = temp;
        }
    }
    public static void show(Comparable[] a){    
        int N = a.length;
        if(N > 20){
            System.out.format("a[0]= %d\n", a[0]);
            System.out.format("a[%d]= %d\n",N-1, a[N-1]);
        }else{
            for(int i=0;i<N;i++){
                System.out.print(a[i]+",");
            }
        }
        System.out.println();

    }
    public static void main(String[] args) {
        Integer[] a = new Integer[]{17,15,14,13,19,12,11,16,18};
        System.out.print("data= ");
        show(a);
        int tn = tukeysNinther(a,0,a.length-1);
        System.out.println("ninther="+a[tn]);
    }
}

Running this a cuople of times gives

data= 11,14,12,16,18,19,17,15,13,
ninther=15

data= 14,13,17,16,18,19,11,15,12,
ninther=14

data= 16,17,12,19,18,13,14,11,15,
ninther=16

Will tuckey's ninther give different values for different shufflings of the same dataset? when I tried to find the median of medians by hand ,I found that the above calculations in the code are correct.. which means that the same dataset yield different results unlike a median of the dataset.Is this the proper behaviour? Can someone with more knowledge in statistics comment?

Lösung

Tukey's ninther examines 9 items and calculates the median using only those.

For different random shuffles, you may very well get a different Tukey's ninther, because different items may be examined. After all, you always examine the same array slots, but a different shuffle may have put different items in those slots.

The key here is that Tukey's ninther is not the median of the given array. It is an attempted appromixation of the median, made with very little effort: we only have to read 9 items and make 12 comparisons to get it. This is much faster than getting the actual median, and has a smaller chance of resulting in an undesirable pivot compared to the 'median of three'. Note that the chance still exists.

Does this answer you question?

On a side note, does anybody know if quicksort using Tukey's ninther still requires shuffling? I'm assuming yes, but I'm not certain.

Lizenziert unter: CC-BY-SA mit Zuschreibung

Nicht verbunden mit StackOverflow