Question

There is a well-known worst-case $O(n)$ selection algorithm to find the $k$'th largest element in an array of integers. It uses a median-of-medians approach to find a good enough pivot, partitions the input array in place, and then recurses into the part that contains the $k$'th largest element.
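For reference, the classical algorithm can be sketched as follows. This is a hedged illustration, not the textbook in-place version: it allocates sublists for clarity (so it uses linear extra space), and it selects the $k$'th *smallest* element; the $k$'th largest is symmetric.

```python
def select(arr, k):
    """Return the k-th smallest element (1-indexed) in worst-case O(n) time."""
    if len(arr) <= 5:
        return sorted(arr)[k - 1]
    # Median of medians: split into groups of 5, take each group's median,
    # then recursively select the median of those medians as the pivot.
    medians = [sorted(arr[i:i + 5])[len(arr[i:i + 5]) // 2]
               for i in range(0, len(arr), 5)]
    pivot = select(medians, (len(medians) + 1) // 2)
    # The classical algorithm partitions in place; here we build sublists.
    lows = [x for x in arr if x < pivot]
    pivots = [x for x in arr if x == pivot]
    highs = [x for x in arr if x > pivot]
    if k <= len(lows):
        return select(lows, k)
    if k <= len(lows) + len(pivots):
        return pivot
    return select(highs, k - len(lows) - len(pivots))
```

The median-of-medians pivot guarantees that each recursive call discards a constant fraction of the elements, which is what yields the linear worst-case bound.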

What if we aren't allowed to touch the input array: how much extra space would be needed to find the $k$'th largest element in $O(n)$ time? Could we find the $k$'th largest element with $O(1)$ extra space and still keep the runtime $O(n)$? For example, finding the maximum or minimum element takes $O(n)$ time and $O(1)$ space.
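The max/min case is the baseline: a single read-only pass with one extra cell suffices.

```python
def find_max(arr):
    """O(n) time, O(1) extra space, input left untouched."""
    best = arr[0]
    for x in arr[1:]:
        if x > best:
            best = x
    return best
```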

Intuitively, I cannot imagine that we could do better than $O(n)$ space but is there a proof of this?

Can someone point to a reference or come up with an argument why finding the $\lfloor n/2 \rfloor$'th element in $O(n)$ time would require $\Omega(n)$ space?


Solution

It is an open problem whether you can do selection in $O(n)$ time with $O(1)$ extra memory cells without changing the input (see here). But you can come pretty close to this.

Munro and Raman proposed an algorithm for selection that runs in $O(n^{1+\varepsilon})$ time while using only $O(1/\varepsilon)$ extra storage (cells). This algorithm leaves the input unchanged. You can pick any small $\varepsilon>0$.

At its core, Munro and Raman's algorithm works like the classical $O(n)$ algorithm: it maintains a left and a right bound (called filters), which are two elements of known rank. The requested element lies between the two filters (rank-wise). By picking a good pivot element $p$, we can check all numbers against the filters and $p$. This makes it possible to update the filters and decrease the number of elements left to check (rank-wise). We repeat until we have found the requested element.
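The filter-update step amounts to a single read-only counting pass. The sketch below is my own illustration (names and interface are assumptions, not the paper's code): given the current filters and a candidate pivot, it computes the pivot's rank among the elements still between the filters, which is the information needed to tighten one of the filters.

```python
def count_between(arr, lo, hi, p):
    """One read-only scan, O(1) extra space.

    lo, hi: current filter values (of known rank); p: candidate pivot.
    Returns (rank of p among elements strictly between the filters,
             number of elements still strictly between the filters).
    """
    rank = 0       # elements between the filters that are <= p
    remaining = 0  # elements still between the filters
    for x in arr:
        if lo < x < hi:
            remaining += 1
            if x <= p:
                rank += 1
    return rank, remaining
```

Depending on which side of $p$ the requested rank falls, $p$ becomes the new left or right filter, and the number of candidates shrinks.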

What differs from the classical algorithm is the choice of $p$. Let $A(k)$ be the algorithm that solves selection for $\varepsilon=1/k$. The algorithm $A(k)$ divides the array into equally sized blocks and identifies a block containing many elements whose ranks lie between the filters (such a block exists by the pigeonhole principle). This block is then scanned for a good pivot element with the help of the algorithm $A(k-1)$. The recursion anchor is the trivial $A(1)$ algorithm. Choosing the right block size (and doing the math) gives the running time and space requirements stated above.
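The trivial anchor $A(1)$ can be realized, for instance, as repeated scanning: find the smallest element above the current lower filter until the target rank is reached. This is a hedged sketch of one possible anchor (the duplicate handling is my own assumed detail), running in $O(n^2)$ time with $O(1)$ extra space and a read-only input.

```python
def select_quadratic(arr, k):
    """A(1)-style anchor: k-th smallest (1-indexed), O(n^2) time, O(1) space."""
    lo = float("-inf")  # lower filter: all ranks up to `count` are below it
    count = 0
    while count < k:
        # Smallest value strictly greater than the current filter.
        nxt = min(x for x in arr if x > lo)
        # Account for duplicates of that value in one extra pass.
        count += sum(1 for x in arr if x == nxt)
        lo = nxt
    return lo
```

Each round costs $O(n)$ and raises the filter past at least one element, hence the quadratic bound; $A(k)$ trades rounds against recursion depth to get $O(n^{1+1/k})$.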

By the way, the algorithms you are looking for were recently named constant-work-space algorithms.

I am not aware of any lower bound.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange