What is the height of an empty BST when using it in context for balancing?

https://cs.stackexchange.com/questions/335

16-10-2019
|

Pergunta

Suppose the datatype for a BST is defined as follows (in SML)

datatype 'a bst_Tree =
   Empty
 | Node of (int * 'a) * 'a bst_Tree * 'a bst_Tree;

So there are two cases one in which the BST is Empty or it can have a (key,value) as well as two children.

Now, for the case of an AVL where the condition is

In an AVL tree, the heights of the two child subtrees of any node differ by at most one
_{- AVL tree Wikipedia}

I want to able to create a height function for use to check whether the tree is balanced. My current setup is as follows

fun height (Empty) = ~1
  | height (Node(v, Empty, Empty)) = 0 (* Redundant matching because of third case *)
  | height (Node(v, L, R)) = 1 + Int.max(height(L),height(R))

I tried to separate the Tree into three conditions

A empty Tree
A Tree with a root node
A populated tree

The reason for this is that there does not seem to be a canonical source on what the value is for the height of an Empty Tree as opposed to one in which only has a root. For the purposes of my balance function it did the job, but I rather try to understand why there isn't a canonical answer for the height of an Empty Tree.

There is a canonical answer, in a matter of speaking on Wikipedia but while initially doing research on this on Stack Overflow I arrived at many comments stating this to be wrong/incorrect/unconventional

Conventionally, the value −1 corresponds to a subtree with no nodes, whereas zero corresponds to a subtree with one node.)

I grabbed the question from which my uncertainty appeared

What is the definition for the height of a tree?

I think you should take a look at the Dictionary of Algorithms and Data Structures at the NIST website. There definition for height says a single node is height 0.

The definition of a valid tree does include an empty structure. The site doesn't mention the height of such a tree, but based on the definition of the height, it should also be 0.

Solução

The height of a rooted tree is defined as the length of the longest simple leaf-to-root path

In the case of a 2 node tree, then clearly this length is 1.

In the case of a 1 node tree (a tree with just the root) then the length has to be 0. (There are 0 edges on the path from the leaf to the root.)

In the case of a 0 node tree, well -1 doesn't make sense. It is a distance and must have a value $\geq 0$, but neither does it make sense to try and measure the length of null.

-1 is sometimes chosen because of its consistency with this recurrence relation for the height of nodes: (which is basically the same as what you posted above)

height(null) = -1
height((v, left, right)) = max(height(left), height(right)) + 1

but we could just as easily define:

height((v, null, null)) = 0
height(null) = 0
height((v, left, right)) = max(height(left), height(right)) + 1

Outras dicas

Any constant works as the height of Empty, because only the difference of heights is important for balancing. So, why not choose 0?

Licenciado em: CC-BY-SA com atribuição

Não afiliado a cs.stackexchange