Question

I wrote the following prefix trie:

class TrieNode {
    char letter;
    HashMap<Character,TrieNode> children;
    boolean fullWord;

    TrieNode(char letter) {
        this.letter = letter;
        children = new  HashMap<Character, TrieNode>();
        this.fullWord = false;
    }
}

class Tree{
    static TrieNode createTree() {
         return (new TrieNode('\0'));
    }

    static void insertWord(TrieNode root, String word) {
        int l = word.length();
        char[] letters = word.toCharArray();
        TrieNode curNode = root;
        for (int i = 0; i < l; i++) {
            if (!curNode.children.containsKey(letters[i]))
                curNode.children.put(letters[i], new TrieNode(letters[i]));
            curNode = curNode.children.get(letters[i]);
        }
        curNode.fullWord = true;
    }
}

I would like to add DFS method in order to find the first node that has more than 1 child (so it will show me the longest common prefix).

I wrote this code:

static void DFS(TrieNode node) {
    for (TrieNode tn: node.children.values()){
       DFS(tn);
        if (node.children.size()>2)
        System.out.println("found the first vertex");
    }
}

But it doesn't work. What am I doing wrong?

Was it helpful?

Solution

Well, first we need to clarify that longest common prefix here means the longest common prefix of any two or more strings in the trie tree.

So, your DFS method won't work well because it simply traverse the whole tree and will output "found the first vertex" on visiting any node whose children.size() > 2 (It should be >=2 here)

What we want here is finding the longest common prefix only. So we need some extra information about which is the longest one. It's easy to see that in my example above:

          root      --depth: 0
           |
           a        --depth: 1
           |
           b        --depth: 2
          / \
         c   f      --depth: 3
        /|\
       d g k        --depth: 4
            \
             k      --depth: 5 

The longest common prefix node has children.size()>1 AND has max depth. In this case, it's node c

So here's one possible correct DFS:

static int max=-1;
static TrieNode maxNode=null;

static void dfs(TrieNode node, int depth){
    if(node.children.size()>1 && depth>max){
        max=depth;
        maxNode=node;
    }
    for (TrieNode tn: node.children.values())
        dfs(tn,depth+1);
}

public static void test(){
    TrieNode root = Tree.createTree();
    Tree.insertWord(root, "abcd");
    Tree.insertWord(root, "abcg");
    Tree.insertWord(root, "abckk");
    Tree.insertWord(root, "abf");
    dfs(root,0);
    System.out.println("Max node:"+maxNode.letter);
}

After running of dfs, the maxNode will hold the node which longest common prefix stops. it's node c in this case.

OTHER TIPS

Your Prefix Tree code seems good. At least it makes sense to me but I haven't checked all corner cases. The problem is you DFS method. Assuming we have the following strings which consist of your prefix tree:

- "abcd"
- "abcg"
- "abckk"
- "abf"

So the prefix tree should look like this:

              root
               |
               a
               |
               b
              / \
             c   f
            /|\
           d g k
                \
                 k

I think what you expect is that we can output the "found the first vertex" at node b (because obviously "ab" is the longest common prefix of the above four strings), however, your DFS doesn't work that way. It will go along the way

root->a->b->c->d then back to c, finding that c.children.size() >1 , then output.

Note that "2 or more" is >=2 or >1 which in your program is >2. I thought that should be a typo. To correct your DFS, simply modify it to check a node's children size first will work:

static boolean DFS(TrieNode node) {
    if (node.children.size()>1){
        System.out.println("found the first vertex on node:"+node.letter);
        return true;
    }
    for (TrieNode tn: node.children.values()){
        if(DFS(tn))
            return true;
    }
    return false;
}

Note to make the program stop on finding our node, I modify the return type of you DFS. Also, regarding on finding longest common prefix, DFS may not be a best choice here, the following code is better than DFS in terms of runtime complexity:

static void lcp(TrieNode node){
    TrieNode first = node;
    while(first.children.size()==1)
        first = first.children.values().iterator().next();
    System.out.println("found the first vertex on node:"+first.letter);
}
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top