Lowest Common Ancestor

March 21st, 2011 | by emontero |

Yet another PP challenge piqued my interest. Here’s the problem statement:

Today’s problem appears with some regularity at places like Proggit and Stack Overflow and in lists of programming interview questions:

Given a binary tree t and two elements of the tree, m and n, with m<n, find the lowest element of the tree (farthest from the root) that is an ancestor of both m and n.

For example, in the tree shown at right, the lowest common ancestor of 4 and 7 is 6, the lowest common ancestor of 4 and 10 is 8, and the lowest common ancestor of 1 and 4 is 3. It is possible for an element of the tree to be its own ancestor, so the lowest common ancestor of 1 and 3 is 3, and the lowest common ancestor of 3 and 6 is 3.

Your task is to write a function that finds the lowest common ancestor of two elements of a binary tree. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

When it comes to real life applicability (and sheer awesomeness!), binary trees are among the highest of the pack in the realm of data structures. Because of this innocuous little fact, I decided I simply had to solve this problem right away. Let’s dive right into it!

NOTE: The Java source code for the solution described below can be found here.

Step 1

We need to represent nodes and the tree itself first. Trees are recursive identities in nature. Namely, trees are made up of trees (subtrees), which are in time composed of more trees (child nodes), which, by definition, are also trees. For instance, analyzing the tree from the problem statement, we can say 3 is the root node for 1 and 6. 6 is the root node for 4 and 7. 4 is its own root node. If we take 3 and detach it from 8, we now have a full-blown tree. In order to model these relationships in code, a class like the following would suffice:

1
2
3
4
5
6
class Node{
     private Node left;
     private Node right;
     private int value;
     private String path;
}

What we’re basically saying with this trivial Java code is that each node can have two nodes (right and left). Nodes also have a value (specifically, an integer in this case) and a String representation of their paths to the root node. This austere model has everything we need in order to solve the problem at hand.

Step 2

Once we have an instance of our tree in place, we must find a way to traverse it and determine the path from the root node to every other node. The following snippet does the trick:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
static void traverse(Node node, Stack<Node> stack){
     stack.push(node);

     if(node.getRight() != null)
       traverse(node.getRight(), stack);

     if(node.getLeft() != null)
       traverse(node.getLeft(), stack);

     StringBuilder path = new StringBuilder();
     for(Node n : stack)
       path.append(n).append(" ");
     node.setPath(path.toString().trim());

     stack.pop();
     return;
}

The algorithm used here to visit each node is famously called DFS (depth-first search). Also, notice how we’re using a stack, a LIFO data structure, to keep visited nodes accessible when we decide to map a node’s path. After computing all the nodes’ paths in our tree, we now must find the lowest common ancestor for the two nodes given. This is rather easy once steps 1 and 2 are out of the way.

Step 3

After the traverse function is executed, we have all nodes’ paths in their respective path properties. For 4, this would be the result:

8 3 6 4

Now for 7:

8 3 6 7

And for 14:

8 10 14

Now, I know what you’re thinking: how is this useful, Elvis? Well, I’m glad you asked. The answer is we must merely check which of these space-separated elements are identical for two nodes up to a certain point (starting from left to right). That is to say, for two specific nodes, the rightmost match in two paths is the lowest common ancestor. For instance, let’s say we want to find out the lowest common ancestor for nodes 1 (path: 8 3 1) and 4 (path: 8 3 6 4). We’d line both Strings up like this:

8 3 1  
8 3 6 4

We’re now going to iterate the String with the fewer elements and compare each constituent to the other node’s path. In this case, because of 1′s path, it’d be 1 vs 4. Essentially, this is what we’d ask each time: are these two elements equal? If they are, we’ve found the lowest ancestor. We move on and compare the next elements in line. We do this until there aren’t anymore items to compare (i.e. we’ve ran out of elements in the shorter path). Here’s some pseudo code that exemplifies this process for nodes 1 and 4:

Step 1: 8 equals 8? Yes. 8 is the lowest ancestor.
Step 2: 3 equals 3? Yes. 3 is now the lowest ancestor.
Step 3: 1 equals 6? No. 3 is still the lowest ancestor.
Step 4: No more elements to compare. We're done.
Step 5: Answer: 3.

Elementary, right? The function getLowestCommonAncestor does exactly what’s described here.

Step 4?

That’s it! We don’t need anything else. A straightforward solution is possible in 3 steps. Can you do better? I recommend you read some of the solutions posted at PP. There are incredible implementations there (e.g. Remco’s amazing Haskell solution).

  1. 2 Responses to “Lowest Common Ancestor”

  2. By Lisandro Diaz on Apr 14, 2011 | Reply

    Nicely done!. Whats the running time of this algorithm ? i only see a DFS but the final portion of the solution isnt posted. Also why prefer a DSF over a BFS?

  3. By emontero on Apr 26, 2011 | Reply

    Since I’m loading the entire tree in memory, the running time is DFS’s worst case performance — i.e. O( | V | + | E | ). Namely, the more vertices and corresponding relationships in your tree, the more time this algorithm will take to finish (the same principle applies to the space complexity of this approach).

    I decided to use DFS because I visualized the problem as recursive in nature. You could still use BFS if you’re so inclined though. Of course, you’d have to use a queue instead of the implicit stack I’m using here. Time and space complexities would stay the same.

Post a Comment