Question

I have a uniformly randomly permuted list of length $n$. I walk through the list element-by-element, and delete an element if it's out-of-order (compared to the previous in-order elements of the list). What I'm left with is a shorter list that's guaranteed to be in-order.

Example: The starred elements of this list would be deleted during the walk through.

[1, 4, 2*, 3*, 5, 7, 6*] -> [1, 4, 5, 7]

My question is: On average, what will be the length of the remaining list? And is there some simple/obvious form of the distribution of this?

And as one follow-up: if we used this for a sorting algorithm (called Drop Sort), what would the run-time be? To be clear, the algorithm would be something like this: $$\textit{DropSort}(\text{list}) = \textit{Merge}(\text{Remaining list}, \textit{DropSort}(\text{Rejected list}))$$

This question was inspired by this r/ProgrammerHumor post.

Was it helpful?

Solution

You can compute the average using linearity of expectation.

Let the random variable $X$ denote the number of elements that are retained. Let $X_i$ be an indicator r.v. that is 1 if the $i$th element is retained, or 0 otherwise. Then $X = X_1+\dots + X_n$, so

$$\mathbb{E}[X] = \mathbb{E}[X_1] + \dots + \mathbb{E}[X_n] = \Pr[X_1=1] + \dots + \Pr[X_n=1].$$

Now the $i$th element is retained if and only if it is the largest number among the first $i$ elements in the list. Therefore, $\Pr[X_i=1]=\frac{1}{i}$, so

$$\mathbb{E}[X] = \frac{1}{1} + \frac{1}{2} + \dots + \frac{1}{n} \approx \log(n) + \gamma$$

In other words, the average length of the remaining list is $O(\log n)$.

I don't know what the distribution is.

As a heuristic, the expected running time of Drop Sort would satisfy a recurrence something like $T(n) \approx T(n-\log(n)) + O(n)$, which corresponds to $O\left(\frac{n^2}{\log n}\right)$ time.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top