Given an unsorted list of $n$ items, how many random comparisons are needed on average to be able to sort the list?

https://cs.stackexchange.com/questions/118531

28-09-2020
|

Question

There is an unsorted list of $n$ items $x_1, \ldots, x_n$. Until you can sort the list, you are given one of the ${n \choose 2}$ possible binary comparisons uniformly at random (with replacement).

On average, how many of these random comparisons will you need to be able to sort the list?

Some follow-up questions:

What does the distribution for the number of comparisons needed look like?
What is the answer if you use $k$-ary comparisons instead of binary ones.
What is the answer if the comparisons are made without replacement (i.e. you won't get the same comparison twice)?
Given a set of comparisons, how can one check if the list is sort-able? I'm almost certain the answer is to construct a DAG and topological sort, but I just want to confirm.

An exact-ish answer would be nice, but a big $O$ answer is fine too, I suppose.

Solution

Asympotically, you'll need $\Theta(n^2 \log n)$ comparisons.

Suppose $x_{(1)},\dots,x_{(n)}$ denotes the elements in sorted order. Then if you don't see a comparison between $x_{(1)}$ and $x_{(2)}$, you will have no way to tell which order they should appear in. The same is true of every pair of adjacent element. So, there are $n-1$ coupons (one per adjacent pair of elements), and you need to collect them all. Based on the coupon collector problem, we know you'll need $\Theta(n \log n)$ randomly chosen coupons before we've collected them all. Each observation has a $2/n$ chance of being a coupon, so in total we'll need $\Theta(n^2 \log n)$ observations before we've collected all the coupons. If we collect all the coupons, we can sort the $x$'s; if we're missing any coupon, we can't sort them.

The subsequent questions boil down to facts about the coupon collector problem, and you can use the tail bounds on Wikipedia to bound the information about the distribution.

If comparisons are chosen without replacement, then you need $\Theta(n^2)$ observations.

A reasonable way to check whether the list is sortable is to do a topological sort, then check that you've observed a comparison between every pair of adjacent items in the topologically sorted order.

Licensed under: CC-BY-SA with attribution

Not affiliated with cs.stackexchange