Question

Recently, I came across a question about finding sum of all values in range $[low, high]$ in BST $T$.

Then I formulated following algorithm to carry out that task:

  1. We do inorder traversal of BST ($sum = 0$ in start)
  2. But whenever we encounter any node having value strictly less than $low$ than we don't visit it's left subtree. We only traverse it's right subtree recursively in same manner.
  3. In same manner if we find any node having value strictly greater than $high$ than we don't traverse it's right subtree.
  4. If value of node falls in given range then we increment $sum$ by 1.

I think this algorithm correctly solve above problem in time complexity $O(m+lg(n))$ where $m$ is number of nodes having value in given range.

My reasoning for that time complexity is that we traverse at most $O(lg(n))$ nodes(This statement I'm not able to prove) which don't get added and we traverse $m$ nodes that gets added into $sum$. Which gives above time complexity.

At least any hint regarding how to prove this time complexity would be great.

Thanks.

Was it helpful?

Solution

It sounds like your pseudocode is equivalent to the following:

  • Do search for low in the BST. Find this value or the next largest value. (assuming that the BST is somewhat balanced, this is $O(lg(n))$)
  • In-order traverse until we reach an element that is greater than high, adding all values that we traverse to some running sum. (in order traversal of $m$ elements in a BST is time $O(m)$ since each element is visited at most twice.)
  • The running sum gives us our answer. The total runtime of this is $O(lg(n) + m)$

This is equivalent to what you write in your psuedocode I believe. The ignoring of subtrees that are not in the range of interest is what is done in a search.

Hope this helps!


For a proof: First you should show that this algorithm is complete, i.e. show that it gets you the answer you want.

Then all you need to do is argue the asymptotic bounds. These arguments should be based on the fact that this is a BST, e.g. a search in a BST is in the worst case $O(h)$ where $h = O(lg(n))$ is the height of the tree because at each step in the search, you go up one level in the tree.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top