Question

So I have been looking into implementing a lowest common ancestor algorithm. I looked at many different algorithms (mainly variations of Trajan's solution or variations of the RMQ).

I am using a non-binary tree. My tree will often change between queries and therefore pre-processing wouldn't necessarily be worthwhile. The tree shouldn't have more than 50-75 nodes. What I am wondering is whether I should bother using their algorithms or just stick with my own.

My Algorithm

myLCA(node1, node2) {
    parentNode := [ ]
    while (node1!=NULL) {
         parentNode.push(node1)
         node1 := node1.parent
    }
     while (node2!=NULL) {
         for i in parentNode.size {
             if (parentNode(i) == node2) {
                 return node2; 
             }
         }
         node2 := node2.parent
     }

}       
Was it helpful?

Solution

As others have mentioned, your algorithm is currently quadratic. That may not be a problem for a dataset as small as 50-75 nodes, but in any case it's straightforward to change it to linear time without using any sets or hashtables, just by recording the complete path to the root for each node, then walking back down from the root and looking for the first different node. The immediately preceding node (which is the common parent of these 2 different nodes) is then the LCA:

linearLCA(node1, node2) {
    parentNode1 := [ ]
    while (node1!=NULL) {
         parentNode1.push(node1)
         node1 := node1.parent
    }
    parentNode2 := [ ]
    while (node2!=NULL) {
         parentNode2.push(node2)
         node2 := node2.parent
    }
    while (node1 == node2 && !isEmpty(parentNode1) && !isEmpty(parentNode2)) {
        oldNode := node1
        node1 := parentNode1.pop()
        node2 := parentNode2.pop()
    }
    if (node1 == node2) return node1    // One node is descended from the other
    else return oldNode                 // Neither is descended from the other
}

EDIT 27/5/2012: Handle the case in which one node is descended from the other, which would otherwise result in attempting to pop() an empty stack. Thanks to damned for pointing this out. (I also realised that it's sufficient to track a single oldNode.)

OTHER TIPS

For a tree that small, I wouldn't bother implementing anything more complex. Your solution looks good, although the time complexity is squared in terms of the height of the tree. If you can easily implement a Set (most languages have it built-in), then the algorithm could be tweaked to,

  1. Traverse from the first node up to the root and collect all nodes in a set
  2. Traverse from the second node up to the root and check if the current node exists in that set. If it does, then that is the common ancestor.

Also, this algorithm assumes that a node can be its own ancestor. Otherwise, you would have to tweak the algorithm slightly. Consider this example,

A
|
B
|
C

When trying to find the lowest common ancestor of B, and C, this algorithm would report B, which may or may not be true depending on how you define ancestor.

Without looking at the details of either algorithm, I'd suggest looking at how important the efficiency of this algorithms is to your overall application, and how much effort would be required to implement another algorithm.

How many times will this algorithm be run in normal (or stressed) operation of your application? Will it cause the user to wait longer than necessary? Are the other algorithms of a different order of magnitude faster than yours? (Someone familiar with the algorithms can give you a more detailed answer on this.)

I think it is not worth optimising a bit of code unless you will see sizable results (some people feel quite strongly that premature optamisation is the root of all evil)

Your algorithm is quadratic, but it can easily be made linear.

Just use hashtable (i.e. set) for parentNode, instead of list. Thus checking whether a node is in parentNode will be O(1) instead of O(n).

I just wrote a blog post about how I had to implement my own algorithm for this problem but extended to a set of nodes with an arbitrary length. You can find it here (with a step-by-step graphical explanation of how it works)

http://bio4j.com/blog/2012/02/finding-the-lowest-common-ancestor-of-a-set-of-ncbi-taxonomy-nodes-with-bio4j/

Cheers,

Pablo

I have one simplistic solution sort the two elements and the lowest be left and highest be right visit root def recurse(root) return nil if root.empty? if left <= root && right >= root return root elsif left <= root && right <= root recurse(root.left) else recurse(root.right) end

So this will check on each traversal The problem soluted in O(log n) time for average and worst and O(log

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top