Pergunta

The randomized selection algorithm is the following:

Input: An array $A$ of $n$ (distinct, for simplicity) numbers and a number $k\in [n]$

Output: The the "rank $k$ element" of $A$ (i.e., the one in position $k$ if $A$ was sorted)

Method:

  • If there is one element in $A$, return it
  • Select an element $p$ (the "pivot") uniformly at random
  • Compute the sets $L = \{a\in A : a < p\}$ and $R = \{a\in A : a > p\}$
  • If $|L| \ge k$, return the rank $k$ element of $L$.
  • Otherwise, return the rank $k - |L|$ element of $R$

I was asked the following question:

Suppose that $k=n/2$, so you are looking for the median, and let $\alpha\in (1/2,1)$ be a constant. What is the probability that, at the first recursive call, the set containing the median has size at most $\alpha n$?

I was told that the answer is $2\alpha - 1$, with the justification "The pivot selected should lie between $1−\alpha$ and $\alpha$ times the original array"

Why? As $\alpha \in (0.5, 1)$, whatever element is chosen as pivot is either larger or smaller than more than half the original elements. The median always lies in the larger subarray, because the elements in the partitioned subarray are always less than the pivot.

If the pivot lies in the first half of the original array (less than half of them), the median will surely be in the second larger half, because once the median is found, it must be in the middle position of the array, and everything before the pivot is smaller as stated above.

If the pivot lies in the second half of the original array (more than half of the elements), the median will surely first larger half, for the same reason, everything before the pivot is considered smaller.

Example:

3 4 5 8 7 9 2 1 6 10

The median is 5.

Supposed the chosen pivot is 2. So after the first iteration, it becomes:

1 2 ....bigger part....

Only 1 and 2 are swapped after the first iteration. Number 5 (the median) is still in the first greater half (accroding to the pivot 2). The point is, median always lies on greater half, how can it have a chance to stay in a smaller subarray?

Foi útil?

Solução

Suppose your array has $n$ elements. As you have noted, the median is always in the bigger part after the first partition. The bigger part has size at most $\alpha n$ if the smaller part has size at least $(1-\alpha) n$. This happens when you pick a pivot that isn't one of the smallest or largest $(1-\alpha) n$ elements. Because $\alpha > 1/2$, you know these are disjoint sets, so the probability of hitting one of the bad pivots is just $2 - 2\alpha$, and $1- 2 + 2\alpha=2\alpha - 1$.

Licenciado em: CC-BY-SA com atribuição
Não afiliado a cs.stackexchange
scroll top