Why does heapsort run in $\Theta(n \log n)$ instead of $\Theta(n^2 \log n)$ time?
-
16-10-2019 - |
Question
I am reading section 6.4 on Heapsort algorithm in CLRS, page 160.
HEAPSORT(A)
1 BUILD-MAX-HEAP(A)
2 for i to A.length downto 2
3 exchange A[i] with A[i]
4 A.heap-size = A.heap-size-1
5 MAX-HEAPIFY(A,1)
Why is the running time, according to the book is $\Theta (n\lg{n})$ rather than $\Theta (n^2\lg{n})$ ? BUILD-MAX-HEAP(A)
takes $\Theta(n)$, MAX-HEAPIFY(A,1)
takes $\Theta(\lg{n})$ and repeated $n-1$ times (line 3).
Solution
Let us count operations line by line. You construct the heap in linear time. Then, you execute the loop and perform a logarithmic time operation $n-1$ times. Other operations take constant time. Hence, your running time is
$\qquad \begin{align} & n + (n-1) \log n + O(1) \\ &= n + n \log n - \log n + O(1) \\ &= \Theta(n \log n). \end{align}$
In other words, as $n$ grows the $n \log n$ term dominates. That is, the cost of building the heap on line 1 is negligible compared to the cost of executing the loop.
OTHER TIPS
I had a different view altogether on this. I could not ignore the fact that in the MAX_HEAPIFY step in the sort (which has a order of $\log n$, where $n$ is the heap size) receives a diminished heap size in each iteration. So to me the Summation looked more or less like $\sum_{k=1}^{n-1}\log k\approx \log (n!)$. Now I know that $n! =\Theta(n^n)$ that way $\log(n!)=\Theta(n\log n)$, but it was never convincing.