Maintaining balanced BSTs in order to get $\frac{n}{2}$ largest elements in constant time
-
29-09-2020 - |
Question
I wonder what is the best data structure I can use in order to achieve the following:
Given a data structure based on a balanced BST, I would like to get a tree with the $\lfloor\frac{n}{2}\rfloor$ greatest values in $O(1)$.
I thought about maintaining another balanced BST and a variable containing the middle value, but I don't think it's a good idea, since the root of a balanced BST might not divide the tree into 2 halves so that the $\lfloor\frac{n}{2}\rfloor$ largest elements are to the right (the tree might be right heavy). Another idea was to use heap, but I don't see how it can help me.
Do you have any idea how to implement it? What do you think about mine?
Solution
Maintain two BSTs $T_1$ and $T_2$ such that $T_2$ always contains the largest $\lfloor n/2 \rfloor$ elements, and $T_1$ contains the remaining elements.
Insertions and deletions can be implemented in $O(\log n)$ time by performing the corresponding operation in a suitably chosen tree and then moving at most one element from one tree to the other.
For example, to insert an element $x$ proceed as follows:
- Find maximum element $m$ of $T_1$
- If $x \le m$:
- Insert $x$ in $T_1$
- If $|T_2| \neq \lfloor n/2 \rfloor$:
- Delete $m$ from $T_1$
- Insert $m$ in $T_2$
- Otherwise:
- Insert $x$ in $T_2$
- If $|T_2| \neq \lfloor n/2 \rfloor$:
- Find the minimum element $m'$ of $T_2$
- Delete $m'$ from $T_2$
- Insert $m'$ in $T_1$
Deletions are handled similarily.
The same approach also works if you maintain two heaps (a max-heap and a min-heap) instead of two BSTs.
See also this answer that uses a similar idea.