Question

I learnt that in a order-statistic tree (augmented Red-Black Tree, in which each node $x$ contains an extra field denoting the number of nodes in the sub-tree rooted at $x$) finding the $i$ th order statistics can be done in $O(lg(n))$ time in the worst case. Now in case of an array representing the dynamic set of elements finding the $i$ th order statistic can be achieved in the $O(n)$ time in the worst case.[ where $n$ is the number of elements].

Now I felt like finding a tight upper bound for forming an $n$ element Red-Black Tree so that I could comment about which alternative is better : "maintain the set elements in an array and perform query in $O(n)$ time" or "maintaining the elements in a Red-Black Tree (formation of which takes $O(f(n))$ time say) and then perform query in $O(lg(n))$ time".


So a very rough analysis is as follows, inserting an element into an $n$ element Red-Black Tree takes $O(lg(n))$ time and there are $n$ elements to insert , so it takes $O(nlg(n))$ time. Now this analysis is quite loose as when there are only few elements in the Red-Black tree the height is quite less and so is the time to insert in the tree.

I tried to attempt a detailed analysis as follows (but failed however):

Let while trying to insert the $j=i+1$ th element the height of the tree is atmost $2.lg(i+1)+1$. For an appropriate $c$, the total running time,

$$T(n)\leq \sum_{j=1}^{n}c.(2.lg(i+1)+1)$$

$$=c.\sum_{i=0}^{n-1}(2.lg(i+1)+1)$$

$$=c.\left[\sum_{i=0}^{n-1}2.lg(i+1)+\sum_{i=0}^{n-1}1\right]$$

$$=2c\sum_{i=0}^{n-1}lg(i+1)+cn\tag1$$

Now

$$\sum_{i=0}^{n-1}lg(i+1)=lg(1)+lg(2)+lg(3)+...+lg(n)=lg(1.2.3....n)\tag2$$

Now $$\prod_{k=1}^{n}k\leq n^n, \text{which is a very loose upper bound}\tag 3$$

Using $(3)$ in $(2)$ and substituting the result in $(1)$ we have $T(n)=O(nlg(n))$ which is the same as the rough analysis...

Can I do anything better than $(3)$?


All the nodes referred to are the internal nodes in the Red-Black Tree.

Was it helpful?

Solution

To construct a red-black tree on $n$ elements you need to spend time $\Omega(n \log n)$, if you are only allowed to compare the elements' keys. To see this notice that an in-order visit of any BST visits the nodes in increasing order of their keys.

If you were able to construct a red-black tree in time $t(n) = o(n \log n)$ then you'd also be able to sort $n$ elements in time $O(t(n) + n) = o(n \log n)$, contradicting the lower bound on sorting for comparison-based algorithms.

On the other hand, if the elements are already sorted then you can build a red-black-tree in time $O(n)$: just recursively build a balanced BST, if the last level is incomplete color its nodes as red, and color every other node black. This requires linear time since the time complexity of the recursive algorithm can be described by the recurrence equation $T(n) = 2T(n/2) + O(1)$.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top