Thought process to solve tree based Dynamic Programming problems

https://cs.stackexchange.com/questions/119707

28-09-2020
|

Question

I am having a very hard time understanding tree based DP problems. I am fairly comfortable with array based DP problems but I cannot come up with the correct thought process for tree based problems and I was hoping somebody could please explain their thought process.

I will talk about my thought process behind array based problems and then explain my troubles with tree based DP problems.

My thought process for array problems

The way I think about DP in array based problems is as follows. Let us consider a problem like Minimum Path Sum. Here the objective is to get from the top left to bottom right positions in a matrix such that we minimize the cost of the path. We can only move left and right.

The way I would approach problems like this is as follows:

First I would construct a recurrence. In this case the recurrence is as follows

The recurrence is:

f(i, j) = a[i][j] // if i == m and j == n
f(i, j) = a[i][j] + f(i, j+1) // if i == m
f(i, j) = a[i][j] + f(i+1, j) // if j == n
f(i, j) = a[i][j] + Math.min( f(i, j+1), f(i+1, j) ) // Otherwise

Next I look at the equations which tells me the problem can be solved using DP as there are overlapping subproblems in f(i+1, j) and f(i, j+1). There is also an optimal substructure.
I can also tell the time/space complexity just by looking at the recurrence.
- Because we must compute all states which is all (i,j) pairs and because time per state is O(1) (adding a[i][j] to result) the time complexity is O(n^2).
- Looking at the recurrence, i depends only on i+1 and not on i+2, i+3 ... similarly j depends only on j+1 and not on j+2, j+3... so we can get away with using only 1 extra row (either i+1 or j+1) instead of the entire matrix so space complexity is O(n).

Hence I would come up with a n^2 time and n space solution. I can do this without any problems.

My thought process for tree problems

However I am having a hard time applying the same thought process to tree based DP problems. As an example let us consider the problem Diameter of Binary Tree where the objective is to find the longest path between any 2 nodes in the tree.

I can come up with a recurrence for this problem which is as follows:

f(n) = 0 // if n == null
f(n) = max( 1+height(n.left) + height(n.right),         // longest path passing through root
            f(n.left),                                  // longest path in left subtree
            f(n.right) )                                // longest path in right subtree

Because f(n.left) for example is computed by doing 1+height(n.left.left) + height(n.left.right) I can tell that DP must be used.

So my approach would be to create a cache of size 'n' that stores all the heights of the nodes. So the space complexity would be O(n).

However the optimal solution of this problem has a space complexity of O(1) and I am having a hard time figuring that out just by looking at the recurrence. How does the recurrence tell you that space complexity can be reduced and that O(1) space is enough and O(n) is not needed? How do you know what value(s) to store in this case? In array based problems I can get the answers to both these questions just by looking at the recurrence but for tree based dp it is not so obvious to me.

My questions:

What can you tell about this problem just by looking at the recurrence for the tree problem? Putting aside my own thought process, if I gave you this recurrence and nothing else what conclusions would you reach and how would you write the program? I am curious about your thought process.
For array based problems I can tell just by looking at the recurrence both how much space I needed to solve the problem AND what exactly I needed to store (I need to store values of row i+1 in min path sum and nothing else). How can I do the same for the tree problem?

Thanks.

Solution

Question 1

The reason why its O(1) space and not O(n) comes down to top down vs bottom up.

Let us first consider the array based problem - min path sum.

If you do it top down you will need O(n^2) space. Remember that when you do top down/memoization, all the state results need to be stored - its essentially just recursion with caching.

However when you do bottom up, you are building up to the solution. Only the state results needed to build up to the next state is required. So in the case of min path sum only states in (i+1, j) is required to build the states in (i, j). We can discard the rest of the states.

This is why when you do bottom up, you can reduce space complexity to O(n).

For the tree problem what you have described is the top-down approach. For the solution you want, you need the bottom-up approach.

Top-down approach for the tree problem

What you have out there is an pre-order traversal. It can be thought of as the same as top-down DP or memoization. And as we have seen in the above example to use top down DP, all the state results need to be stored in the cache.

Here the states you care about is the heights of the nodes so height of each node needs to be stored bringing the complexity to store states upto O(n).

So think about how the tree problem would be implemented top down - you have all the heights in a cache and you call it each time you need to access the height of a node.

So every time you get to a node, you use the cache to get the heights and then you do some simple arithmetic in constant time to get the diameter. In this case that arithmetic is height[n.left] + height[n.right].

Top down is simply recursion with caching so for the top down variant, you are storing only the heights to get a cache size of O(n). The code is as follows:

Map<TreeNode, Integer> dp = new HashMap<>();

public int diameterOfBinaryTree(TreeNode root) {
    if(root == null)
        return 0;
    else
        return Math.max(
            height(root.left, dp)+height(root.right, dp),
            Math.max( diameterOfBinaryTree(root.left), diameterOfBinaryTree(root.right) )
        );
}

private int height(TreeNode root, Map<TreeNode, Integer> dp) {
    if(dp.containsKey(root))
        return dp.get(root);

    if(root == null)
        return 0;
    else
        dp.put(root, 1 + Math.max(height(root.left, dp), height(root.right, dp)) );

    return dp.get(root);
}

Bottom-up approach for the tree problem

The bottom up approach to any tree problem uses post-order traversal instead of pre-order traversal. We construct the postorder traversal to compute the states we want to store in the cache - in this case it is the height.

Since we compute the children before computing the current node, for every node we visit, we have the height of the children available to us. Since the heights of nodes are available to us, for a given node we also have the diameter passing through that node available to us while we are processing the node.

So if we want the max diameter, we simply need to compute the diameter of the current node and then compare it with the max diameter seen so far - we do this for all nodes. All this is done within a single post-order traversal of the tree.

The code is as follows:

int maxDiameter = 0;

public int diameterOfBinaryTree(TreeNode root) {
    postorderHeight(root);
    return maxDiameter;
}

private int postorderHeight(TreeNode root) {
    if(root == null)
        return 0;
    int leftSubtreeHeight = postorderHeight(root.left);
    int rightSubtreeHeight = postorderHeight(root.right);
    
    maxDiameter = Math.max(maxDiameter, leftSubtreeHeight + rightSubtreeHeight);
    
    return 1 + Math.max(leftSubtreeHeight, rightSubtreeHeight);
}

Technically the space complexity of this is proportional to recursion stack size however cache size is constant.

Question 2

Ask yourself if the states you need to store can be reduced if you do bottom up (post-order). If yes do that - in this case yes we can obtain heights for free if we use post order.

OTHER TIPS

I don't know if this helps, but I also struggled a great deal with DP problems on trees, and what helped for me was considering some simpler problems first, and really do a bunch of exercises of this type. It also really helps to program them in (e.g.) Python so that you get some hands-on experience.

So, perhaps it is better to start with a simpler tree problem first, like (weighted) Vertex Cover or (weighted) Independent Set?

In both these cases (they are similar, but let us focus on Vertex Cover), the entire solution of a subtree can be summarized in the root of the subtree with two numbers: the optimal solution if the root of the subtree is in the vertex cover, and the optimal solution if the root of the subtree is not in the vertex cover (but covered from below).

So the plan is to store, for each node $u$, two numbers, $u_{\text{in}} = c_i$ and $u_{\text{in}} = c_o$, and in the end, we will look at what are the values in the root node $r$, i.e. $r_{\text{in}}$ and $r_{\text{out}}$.

The Algorithm

The algorithm is then quite simple. For the base case, starting at the leaves, in leaf node $v$, you store $v_{\text{in}} = w(v)$ and $v_{\text{out}} = 0$. Clearly, in the subtree with only one node, there are no edges to cover, so this is quite easy.

Then the rest of the algorithm is given a node $v$ with children $u_1, \dots, u_d$, we need to compute two cases.

$v$ is in the vertex cover: then the optimal solution is $v_{\text{in}} = w(v) + \sum_{u \in u_1 \dots u_d} \min(u_{\text{in}}, u_{\text{out}})$
$v$ is not in the vertex cover, so all children must be, hence $v_{\text{out}} = \sum_{u \in u_1 \dots u_d} u_{\text{in}}$

Now, the only thing we need to do is return the result of the root node $r$, which can either be in or not in the vertex cover, so we return $$ \min( r_{\text{in}}, r_{\text{out}}) $$

In other words, we found out what information we need in the "separator" of a subtree, and how can combine the information from the subtree with the rest of the tree.

Exercise: Solve Weighted Vertex Cover and Weighted Independent Set on trees.
Exercise 2: Solve Weighted Dominating Set, which needs one more state, namely in the dominating set, not in the dominating set but dominated, not yet dominated (dominated from above).

Licensed under: CC-BY-SA with attribution

Not affiliated with cs.stackexchange