Question

When creating a suffix tree of the string "ABAB" I get only 2 nodes:

ABAB and BAB

The longest repeatead substring ("AB") should be located by "the deepest node with at least k descendants" but this is not the case with my string, what's wrong here?

Thanks

Was it helpful?

Solution

If you are using some form of suffix tree which has only two nodes for the string ABAB, then it won't work directly with the algorithm you quoted. This is what the suffix tree should look like, with O representing the nodes and $ used to mark the end of the string.

         O
        / \
       /   \
      B     AB
     /       \
    O         O
   / \       / \
  $  AB$    $  AB$
 /     \   /     \
O       O O       O

The key feature here (and which is missing from the tree you are using) is that each leaf node corresponds to a suffix of the string.

The deepest node with at least two leaf descendants is at path AB (the depth is the substring length needed to reach that node from the root, in this case 2), and that is indeed the longest repeated substring.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top