Question

I have found this code in github for quickselect algorithm otherwise known as order-statistics. This code works fine.

I do not understand medianOf3 method, which is supposed to arrange the first, middle and last index in sorted order. but actually it does not when I ouput the array, after calling the medianof3 method. I could follow this method as to what it is doing except the last call of swap(list, centerIndex, rightIndex - 1);. can someone explain why this is called?

import java.util.Arrays;



/**
* This program determines the kth order statistic (the kth smallest number in a
* list) in O(n) time in the average case and O(n^2) time in the worst case. It
* achieves this through the Quickselect algorithm.
*
* @author John Kurlak <john@kurlak.com>
* @date 1/17/2013
*/
public class Quickselect {
   /**
* Runs the program with an example list.
*
* @param args The command-line arguments.
*/
   public static void main(String[] args) {
       int[] list = { 3, 5, 9, 10, 7, 40, 23, 45, 21, 2 };
       int k = 6;
       int median = medianOf3(list, 0, list.length-1);
       System.out.println(median);
       System.out.println("list is "+ Arrays.toString(list));
       Integer kthSmallest = quickselect(list, k);

       if (kthSmallest != null) {
           System.out.println("The kth smallest element in the list where k=" + k + " is " + kthSmallest + ".");
       } else {
           System.out.println("There is no kth smallest element in the list where k=" + k + ".");
       }
       System.out.println(Arrays.toString(list));
   }

   /**
* Determines the kth order statistic for the given list.
*
* @param list The list.
* @param k The k value to use.
* @return The kth order statistic for the list.
*/
   public static Integer quickselect(int[] list, int k) {
       return quickselect(list, 0, list.length - 1, k);
   }

   /**
* Recursively determines the kth order statistic for the given list.
*
* @param list The list.
* @param leftIndex The left index of the current sublist.
* @param rightIndex The right index of the current sublist.
* @param k The k value to use.
* @return The kth order statistic for the list.
*/
   public static Integer quickselect(int[] list, int leftIndex, int rightIndex, int k) {
       // Edge case
       if (k < 1 || k > list.length) {
           return null;
       }

       // Base case
       if (leftIndex == rightIndex) {
           return list[leftIndex];
       }

       // Partition the sublist into two halves
       int pivotIndex = randomPartition(list, leftIndex, rightIndex);
       int sizeLeft = pivotIndex - leftIndex + 1;

       // Perform comparisons and recurse in binary search / quicksort fashion
       if (sizeLeft == k) {
           return list[pivotIndex];
       } else if (sizeLeft > k) {
           return quickselect(list, leftIndex, pivotIndex - 1, k);
       } else {
           return quickselect(list, pivotIndex + 1, rightIndex, k - sizeLeft);
       }
   }

   /**
* Randomly partitions a set about a pivot such that the values to the left
* of the pivot are less than or equal to the pivot and the values to the
* right of the pivot are greater than the pivot.
*
* @param list The list.
* @param leftIndex The left index of the current sublist.
* @param rightIndex The right index of the current sublist.
* @return The index of the pivot.
*/
   public static int randomPartition(int[] list, int leftIndex, int rightIndex) {
       int pivotIndex = medianOf3(list, leftIndex, rightIndex);
       int pivotValue = list[pivotIndex];
       int storeIndex = leftIndex;

       swap(list, pivotIndex, rightIndex);

       for (int i = leftIndex; i < rightIndex; i++) {
           if (list[i] <= pivotValue) {
               swap(list, storeIndex, i);
               storeIndex++;
           }
       }

       swap(list, rightIndex, storeIndex);

       return storeIndex;
   }

   /**
* Computes the median of the first value, middle value, and last value
* of a list. Also rearranges the first, middle, and last values of the
* list to be in sorted order.
*
* @param list The list.
* @param leftIndex The left index of the current sublist.
* @param rightIndex The right index of the current sublist.
* @return The index of the median value.
*/
   public static int medianOf3(int[] list, int leftIndex, int rightIndex) {
       int centerIndex = (leftIndex + rightIndex) / 2;

       if (list[leftIndex] > list[rightIndex]) {
           swap(list, leftIndex, centerIndex);
       }

       if (list[leftIndex] > list[rightIndex]) {
           swap(list, leftIndex, rightIndex);
       }

       if (list[centerIndex] > list[rightIndex]) {
           swap(list, centerIndex, rightIndex);
       }

       swap(list, centerIndex, rightIndex - 1);

       return rightIndex - 1;
   }

   /**
* Swaps two elements in a list.
*
* @param list The list.
* @param index1 The index of the first element to swap.
* @param index2 The index of the second element to swap.
*/
   public static void swap(int[] list, int index1, int index2) {
       int temp = list[index1];
       list[index1] = list[index2];
       list[index2] = temp;
   }
}
Was it helpful?

Solution

So I wrote the original code, but I did a poor job of making it readable.

Looking back at it, I don't think that line of code is necessary, but I think it is a small optimization. If we remove the line of code and return centerIndex, it seems to work without any issues.

Unfortunately, the optimization it performs should be refactored out of medianOf3() and moved into randomPartition().

Essentially, the optimization is that we want to "partially sort" our subarray as much as possible before partitioning it. The reason being this: the more sorted our data is, the better our future partition choices will be, which means our run time will hopefully be closer to O(n) than O(n^2). In the randomPartition() method, we move the pivot value to the far right of the subarray we're looking at. This moves the far right value into the middle of the subarray. This is not desired since the far right value is supposed to be a "larger value". My code tries to prevent this by placing the pivot index right next to the rightmost index. Then, when the pivot index is swapped with the rightmost index in randomPartition(), the "larger" rightmost value doesn't move into the middle of the subarray, but stays near the right.

OTHER TIPS

Function medianOf3 is to define order of left median and right. Last statement

swap(list, centerIndex, rightIndex - 1)

is used to achieve following precondition of a sort:

However, instead of recursing into both sides, as in quicksort, quickselect only recurses into one side – the side with the element it is searching for. This reduces the average complexity from O(n log n) (in quicksort) to O(n) (in quickselect).

And then algorithm continues with:

   for (int i = leftIndex; i < rightIndex; i++) {
       if (list[i] <= pivotValue) {
           swap(list, storeIndex, i);
           storeIndex++;
       }
   }

in order to

that the values to the left of the pivot are less than or equal to the pivot and the values to the right of the pivot are greater than the pivot.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top