As the name implies, foldTree
defines the fold function over the custom Tree
type.
A naive way of defining a foldTree
could be:
let rec foldTreeNaive accFun init = function
| Empty -> init
| Branch (x, left, right) ->
let lacc = foldTreeNaive accFun init left
let racc = foldTreeNaive accFun init right
accFun x lacc racc
The problem with this function is that it could potential make very deep recursive calls if the tree being folded over is deep, since the recursive calls must complete for a node before the accumulator function can be called. For example the following causes a stack overflow exception:
let unbalanced = [1..100000] |> List.fold (fun t i -> Branch(i, t, Empty)) Empty
let nodeCount = foldTreeNaive (fun _ lc rc -> lc + rc + 1) 0 unbalanced
The usual way to avoid such stack overflows is to make the function tail recursive, however this seems impossible in this case since there are two recursive calls to make, instead of the one required when folding over lists.
foldTree
is defined using the local loop
function. This function is interesting in that it is defined using continuation passing style. In CPS, each function takes an additional 'continuation' function which is passed the result of the computation and is responsible for deciding what happens next. Note that loop
is tail recursive and so avoids the overflow problem of foldTreeNaive
.
The type of the loop
function is:
Tree<'a> -> ('b -> 'c) -> 'c
where 'a is the type of nodes in the tree, 'b is the accumulator type, and 'c is the result of the continuation function.
In the case of a leaf node, the continuation is passed the empty accumulator value passed to the foldTree
function.
When folding over a non-empty tree in the Branch
case, the result of the fold depends on the results for the left and right subtrees. This is done recursively, first by folding over the left subtree, then the right. For the recursive call over the left subtree, loop
must build a new continuation to receive the result, this is the
(fun lacc ->
loop right (fun racc ->
cont (branchF x lacc racc))
function. What this continuation does is to make a recursive call over the right subtree, passing yet another continuation to receive the result of that fold. When that continuation is called, the results for the left and right subtrees are available in lacc
and racc
. At this point the accumulation function for the node can be called with the value for the current node and the results for the left and right subtrees. The result of this function is then passed to the original continuation passed to loop
.
The loop
function is then invoked by the foldTree
function in the line:
loop t id
Here, id
is the continuation which will receive the result of the fold for the root node of the tree. Since this is the value required, id
just returns its argument without modification.
You might also find this description of fold for binary trees useful.