سؤال

I'm looking for an algorithm to compare two trees to each other and check if the on tree is a subtree of the other.

I first provided an own tree implementaion, which has the following structure:

public class PlatformTree {
   private HwNode root;

   .......
}

public class HwNode {
  private HwNode parent;
  private ElementType elementType;
  private Hardware hw;
  private List<Node> children = new ArrayList<Node>();

  public Node(Node parent, ElementType elementType, Hardware hw) {
     ...
  }

  public void addChild(Node child) {
     children.add(child);
  }

  ....

The following Images should give you an overview:

Tree1

Tree1

Tree2

enter image description here

As displayed in the image, I want to check if all subtrees of Tree1 are contained in Tree2. The root items of the trees are only dummy elements, which are used to access the subtrees.

I always check if subtrees of Tree1 are contained in Tree2. Nodes of type Tree1 always have one successor (Microcontroller.Memory), while Nodes of type Tree2 can have several successors (Microcontroller.Memory/Memory/Memory).

The ElementType determine if Nodes are equal or not.

The comparison method should return True, if all subtrees are contained in the other tree. Otherwise it should return False. I trie now a long time to provide an algorithm, but I still have some problems with the recursion calls, I reckon. That's what I've done so far:

Class TreePlatform:

public boolean containsSubTree(PlatformTree tree1) {
        Boolean b = true;

        // check all subtrees of Tree1
        if (tree1.getRoot() != null && getRoot() != null) {
            for (HwNode subRootNode : tree1.getRoot().getChildren()) {
                b = getRoot().containsSubTree(subRootNode);
            }
        } else {
            return false;
        }
        return b;
    }

Class HwNode

public boolean containsSubTree(HwNode abstractNode) {
    for (HwNode subRootNode : getChildren()) {
        if (hasSubTree(subRootNode, abstractNode)) {
            return true;
        }
    }
    return false;
}

private boolean hasSubTree(HwNode subRootNode, HwNode abstractNode) {
    if (subRootNode.getElementType() != abstractNode.getElementType()) {
        return false;
    }
    if (!abstractNode.getChildren().isEmpty()) {
        if (!subRootNode.getChildren().isEmpty()) {
            HwNode abstractSubNode = abstractNode.getChildren().get(0);
            for (HwNode subNode : subRootNode.getChildren()) {
                if (!hasSubTree(subNode, abstractSubNode)) {
                    return false;
                }
            }
        } else {
            return false;
        }
    } else if (subRootNode.getChildren().isEmpty()) {
        return true;
    }
    return true;
}

I'm not happy with my algorithm yet, but I'm stucked at the moment and don't know how to solve this. Any ideas how I can compare both tree to each other?

هل كانت مفيدة؟

المحلول

Aight, so the question is about looking for one tree inside another. This is a standard interview problem if the tree has an ordering, but yours doesn't and that makes matters significantly more complicated. Similarly, if we were looking for tree congruence rather than containment, and if we could use Kelly's Theorem as Adam Gent indicates in the OP's comments and just count subtrees.

For clarity's sake, I'm going to call the two trees the pattern tree and the target tree. We want to test whether the pattern tree is a subtree of the target tree.

Your approach is (though I might be understanding) that a node P from the pattern tree is equivalent to a node T in the target tree iff every child of P is equal to a node in T. Thing is, this has a problem: what if P has two identical subtrees as children? It'd match against a node T that only had one subtree!

Now this isn't a fatal flaw: we could adapt this approach by deleting a subtree of T when we think we've found a match, and then doing a pile of backtracking if it later turns out we're wrong. That's kinda prone to combinatorial explosion though.

A better way is to recurse from the ground up rather than the top down. What I mean by this is

  • Check whether all the pattern tree's leaves occur in the target tree.
  • If they do, build a dictionary D mapping the pattern tree's leaves to lists (there might be more than one!) of matching leaves in the target tree.
  • Check whether all height-1 subtrees of the pattern tree occur in the target tree.
  • To do this, suppose we have a height-1 node Q from the pattern tree, with leaves X and Y.
  • Use the dictionary to get the lists of leaves in the target tree matching X and Y. Filter these lists so that any leaf in the target tree appears at most once (this deals with the case that X is identical to Y)
  • See if some leaf matching X in the target tree has a parent in common with a leaf matching Y in the target tree.
  • If so, hooray! If this parent is the same type as Q, it's a match! Find all such parents - we'll need them for the next step.
  • If all the height-1 subtrees of the pattern tree do indeed appear in the target tree, build a new dictionary mapping them to lists of matching height-1 trees in the target tree.
  • Repeat, building a height-2 tree dictionary using the height-1 dictionary.
  • Repeat, building a height-3 tree dic- you get the point. Eventually you'll reach the root of the pattern tree and you're done.

This works out at quadratic time, which isn't great but the usual Googling didn't reveal a better algorithm (mostly because the binary tree/ordered tree variants have flooded the search space). If anyone knows of better, I'd be glad to hear it. Using Kelly's Theorem against all subtrees works out at cubic time naively I think, but with some memoization that'd probably come down to quadratic too.

مرخصة بموجب: CC-BY-SA مع الإسناد
لا تنتمي إلى StackOverflow
scroll top