Question

I am trying to understand average no of exchanges in Quicksort.

Here is the code to partition the array -

 private static int partition(double[] a, int lo, int hi) {
        int i = lo;
        int j = hi + 1;
        double v = a[lo];
        while (true) { 

            // find item on lo to swap
            while (less(a[++i], v))
                if (i == hi) break;

            // find item on hi to swap
            while (less(v, a[--j]))
                if (j == lo) break;      // redundant since a[lo] acts as sentinel

            // check if pointers cross
            if (i >= j) break;

            exch(a, i, j);
        }

        // put v = a[j] into position
        exch(a, lo, j);

        // with a[lo .. j-1] <= a[j] <= a[j+1 .. hi]
        return j;
    }

To start with in first partition stage -

There are total comparisons as below $ C_N= N +1 $

so exchanges in worst case should be $ \frac{N}{2} - 1 $

But am not able to deduce number of exchanges in first partition stage which is $ \frac{(N-2)}{6} $

can someone explain how to deduce this?

Was it helpful?

Solution

The partition method in the question, partition(a, lo, hi) is called Hoare’s partition scheme, which is the most classic partition scheme used in quicksort.

Here is the situation. There are $n$ numbers that are non-equal pairwise. Array $a=(a_0, a_1, \cdots, a_{n-1})$ is a uniformly-random permutation of those $n$ numbers. We will run the method partition(a, 0, n-1). What is the number of exchanges executed inside the loop while(true)? (Note that the exchange exch(a, lo, j) outside the loop is not counted.)

WLOG, let the $n$ numbers be $1, 2, \cdots, n$.

Suppose the first element of $a$ is $p$ (for pivot). Excluding that first element, split $a$ into one segment of $p-1$ numbers followed by another segment of $n-p$ numbers.

$$a = (p,\ \underbrace{a_1,\ a_2,\ \cdots,\ a_{p-1},}_{p-1\text{ numbers}}\ \ \underbrace{a_p,\ a_{p+1},\ \cdots\ a_{n-1}}_{n-p\text{ numbers}})$$

Observe the number of elements greater than $p$ in the front segment is always the same as the number of elements smaller than $p$ in the back segment. (Proof: let the two numbers be $x$ and $y$ respectively. The total number of elements greater than $p$ in $a$ is $x + ( n-p-y)$. On the other hand, the total number of numbers greater than $p$, which are $p+1, p+2, \cdots, n$, is $n-p$. That means, $x+(n-p-y)=n-p$, which means $x=y$.)

In the while loop of partition(a, 0, n-1), the first element greater than $p$ in the front segment is exchanged with the last element smaller than $p$ in the back segment. Then the second element greater than $p$ in the front segment is exchanged with the second last element smaller than $p$ in the back segment. And so on, until we have exhausted elements greater than $p$ in the front segment (and, in the meantime, we have exhausted elements smaller than $p$ in the back segment as well). So the number of exchanges made is equal to the number of elements greater than $p$ in the front segment.

To calculate the average number of exchanges made, let us run partition(a, 0, n-1) as $a$ ranges over all permutations of $1, 2, \cdots, n$. Then

$$\text{total number of exchanges}=\sum_{p=1}^n\sum_{a_0=p}\text{number of exchanges on }a\\ =\sum_{p=1}^n\sum_{x}^{}x\cdot(\text{number of permutations that starts with }p\text{ with }x\text{ exchanges performed})$$

As we have explained above, for a permutation that starts with $p$, $x$ exchanges will be performed on it if and only if that permutation has $x$ elements greater than $p$ in its front segment of $p-1$ elements. The number of such kind of permutations is the product of the following four factors.

$$ \underbrace{\binom{n-p}{x}}_{\text{the number of ways to choose }\quad\ \ \\x\text{ elements out of all }n-p\\\text{numbers that are greater than }\ \\p \text{ for the front segment}} \underbrace{\binom{p-1}{p-1-x}}_{\text{the number of ways to chose }\\p-1-x\text{ elements out of all}\\p-1 \text{ numbers that are smaller }\quad\ \ \\\text{than }p\text{ for the front segment}} \underbrace{(p-1)!}_{\text{the number of ways}\\\text{to permute all }p-1\\\text{elements in the}\\\text{front segment}}\quad\ \ \underbrace{(n-p)!}_{\text{the number of ways}\\\text{to permute all }\\n-p\text{ elements in}\\\text{the back segment}}\\ $$

So,

$$\begin{aligned}&\quad\text{total number of exchanges}\\ &=\sum_{p=1}^n\sum_{x=0}^{\min(n-p, p-1)}x\binom{n-p}{x}\binom{p-1}{p-1-x}(p-1)!(n-p)!\\ &=\sum_{p=1}^n\left((p-1)!(n-p)!\sum_{x=0}^{\min(n-p, p-1)}x\binom{n-p}{x}\binom{p-1}{p-1-x}\right)\\ &=\sum_{p=1}^n\left((p-1)!(n-p)!\sum_{x=1}^{\min(n-p, p-1)}(n-p)\binom{n-p-1}{x-1}\binom{p-1}{p-1-x}\right)\\ &=\sum_{p=1}^n\left((p-1)!(n-p)!(n-p)\sum_{y=0}^{\min(n-p-1, p-2)}\binom{n-p-1}{y}\binom{p-1}{p-2-y}\right)\\ &\stackrel{\bigstar}{=}\sum_{p=1}^n(p-1)!(n-p)!(n-p)\binom{n-2}{p-2}\\ &=\sum_{p=1}^n(p-1)(n-p)(n-2)!\\ &=(n-2)!\left((n+1)\sum_{p=1}^np-\sum_{p=1}^nn-\sum_{p=1}^np^2\right)\\ &=(n-2)!\left((n+1)\frac{n(n+1)}2-n^2-\frac{n(n+1)(2n+1)}6\right)\\ &=n-2)!\,\frac{n(n-1)(n-2)}6=n!\,\frac{n-2}6.\\ \end{aligned}$$

The average number of exchanges is $$\frac{\text{total number of exchanges}}{\text{number of permutations}} =\frac n6-\frac13 \color{#d0d0d0}{\quad\text{for } n\ge2}.$$


You might wonder the equality above that is marked with a $\bigstar$. It can be established by splitting into two cases, when $n-p-1\lt p-2$ and when $n-p-1\ge p-2$, and then using Vandermonde's identity.

Another way to prove it uniformly is to rewrite the Vandermonde's identity as, for any non-negative integer $m, n, k$, $$\sum_{r=-\infty}^{\infty}\binom{m}{k}\binom{n}{r-k}=\binom{m+n}{r},$$ where we extend the definition of the binomial coefficient $\binom nk$ so that for all $n\ge0$ we have $\binom nk=0$ for all $k\lt 0$ or $k\gt n$. Note that the summation on the left-hand-side ranges is well-defined, since it contains, in fact, only finitely many non-zero items. We have, $$\sum_{y=0}^{\min(n-p-1, p-2)}\binom{n-p-1}{y}\binom{p-1}{p-2-y}=\sum_{y=-\infty}^{\infty}\binom{n-p-1}{y}\binom{p-1}{p-2-y}\\=\binom{(n-p-1)+(p-1)}{y+(p-2-y)}=\binom{n-2}{p-2}.$$

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top