Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

A variation of the "Find common ancestor"

I recently gave an phone interview.It involved coding of a problem as part of the process.
The problem was a variation of the Find the most closest common ancestor of a tree but with a twist. The tree was much graph-like i.e. child nodes could be connected. Example:

     A   
   /  
  B 
  | \ 
  C  E  
  |  |  
  D  F  
  \  /  
   G   

In this case given this tree and the nodes F and D the resulting closest common ansestor would be B. The second twist was that the tree was presented as an array. The method to implement had the following input:
public String getCA(String[] nodes, String[][] parentNodes, String targetNode1, String targetNode2)
In this example nodes = {"G", "F", "E", "D", "C", "B", "A"} and parentNodes = {{"F","D"},{"E"}, {"B"}, {"C"}, {"B"}, {"A"}, null}
Essentially for nodes[i] the parent(s) is parentNodes[i].
To be honest I panicked completely (was already pretty nervous) and took me really really long time to figure out an answer.
Although I think this is solved recursively I somehow came up with an iterative solution which as far as I can tell works:I push nodes in queue and go up the path first for the first target node and then the second node. Once I find an already encountered node I consider it as the solution (have added comments to clear things up).

public String getCA(String[] nodes, String[][] parentNodes, String targetNode1, String targetNode2) {  
        if(nodes == null || parentNodes == null){  
            throw new IllegalArgumentException();  
        }  

        Map<String, String[]> map = new HashMap<String, String[]>();  
        for(int i = 0; i < nodes.length; i++){  
            map.put(nodes[i], parentNodes[i]);  
        }  
        //These are the parents visited as we go up  
        Set<String> parentsSeen = new HashSet<String>();  
        parentsSeen.add(targetNode1);  

        Queue<String> path = new LinkedList<String>();  
        String[] parents = map.get(targetNode1);  
        //The root is the common parent  
        if(parents == null){  
            return targetNode1;  
        }   

        //Build the path up  
        for(String parent:parents){  
            path.add(parent);  
        }  
        while(!path.isEmpty()){  
            String currentParent = path.remove();  
            parentsSeen.add(currentParent);  
            parents = map.get(currentParent);  
            if(parents == null){  
                continue;   
            }  
            for(String parent:parents){  
                path.add(parent);  
            }  
        }  

        parents = map.get(targetNode2);  
        //It is the root, so it is the common parent  
        if(parents == null){  
            return targetNode2;  
        }  
        //Start going up for the targetNode2. The first parent that we have already visited is the common parent  
        for(String parent:parents){  
            if(parentsSeen.contains(parent)){  
                return parent;  
            }  
            path.add(parent);  
        }  

        while(!path.isEmpty()){  
            String currentParent = path.remove();  
            if(parentsSeen.contains(currentParent)){  
                return currentParent;  
            }             
            parents = map.get(currentParent);  
            if(parents == null){  
                continue;  
            }  
            for(String parent:parents){  
                path.add(parent);  
            }  
        }  
        return null;            
    }

I did not get a move forward call. Now due to the fact that I am "self-taught" I would be interested to understand how I messed up here. Due to the fact that this is technical issue, I believe it is not subjective matter and hopefully I could get help here from experienced people.
So as collegue programmers how would you handle the problem and how do you evaluate my solution? What do I need to do to improve my skills?
You can be as straightforward as you possible. As long as I can understand what went wrong and learn I am satisfied

like image 677
Cratylus Avatar asked Aug 27 '12 19:08

Cratylus


5 Answers

It's not even clear to me what "most closest" means here. Consider the following graph:

    I
   /\
  /  \
 /    \
H      E
|\    /|
| \  / |
G  \/  D
|  /\  |
| /  F C
|/    \|
A      B

Here there are 2 common ancestors of A and B, H and E. H is distance 2 from both A and B. E is distance 1 from A but distance 3 from B. Which do I choose?

Furthermore, no matter what your answer to that question, finding the set of ancestors from one and then doing a BFS from the other doesn't work. Finding all the ancestors of A and then doing a BFS from B finds H first, and finding all the ancestors of B and then doing a BFS from A finds E first. As an adversary, I can switch A and B to make your algorithm fail with respect to whatever choice you make (the choice of whether 2/2 or 1/3 is better).

So the correct algorithm must be more complicated than just an ancestor set calculation plus a BFS. And unless you tell me how to make that choice, I'm not sure I can pin down a correct algorithm.

like image 112
Keith Randall Avatar answered Oct 19 '22 03:10

Keith Randall


Very nice question, you really put effort into writing this. I'm sorry about the interview, I know it must be really frustrating when things like this happen. Let's see about the problem.

An idea that comes to mind is this: you can go recursively upwards until you reach the root from both target nodes and build two lists. At the end, simply find the first common node in both lists.

public static List<String> visitUpwards(Map<String, String[]> mapping, String current) {
    List<String> list = new ArrayList<String>();
    list.add(current);
    if(mapping.get(current) == null) return list;
    else {           
       for(String s: mapping.get(current)) {              
          list.addAll(visitUpwards(mapping, s));                            
       }
       return list;
    }
}  

Edit: On second thought, this was not a very good idea, since it basically does a DFS search upwards, making finding the first common ancestor difficult. A better idea is to run a BFS for each target instead (this may seem similar to your solution :D):

public static Set<String> BFS(Map<String, String[]> mapping, String node) {
    Queue<String> queue = new LinkedList<String>();
    Set<String> result = new LinkedHashSet<String>();
    queue.add(node);
    while(!queue.isEmpty()) {
        String current = queue.remove();
        result.add(current);
        if(mapping.get(current) != null)
            for(String s: mapping.get(current)) {
                queue.add(s);
            }
    }
    return result;
}

Then use another map to find the first common ancestor:

Set<String> list1 = BFS(mapping, "D");      
Set<String> list2 = BFS(mapping, "G");
System.out.println(list1);
System.out.println(list2);

Map<String, Boolean> src = new HashMap<String, Boolean>();

for(String s: list1) {
    src.put(s, true);
}

for(String s: list2) {
    if(src.get(s) != null && src.get(s)) {
        System.out.println(s);
        break;
    }
}
like image 25
Tudor Avatar answered Oct 19 '22 03:10

Tudor


I can't tell you why the company didn't call you back. But your code would benefit greatly by extracting code into smaller, meaningful methods, rather than using a single, big method with all that detailed, low-level logic. And although I haven't gone so far as to try compiling your code and running it on some test cases, just from a quick reading, I'm not at all convinced that the code is correct. It's possible that the closest comment ancestor could be targetNode1 or targetNode2 (say if targetNode2 was the parent of targetNode1). (I guess this depends on the details of how you define "closest common ancestor" -- you should have clarified what it means in that case.)

  • Using a map from node to parents is a good idea. But you can extract the code which builds that map into a small method, with a meaningful name.
  • You can define a method which implements a breadth-first search through a graph (which is essentially what you are doing here). Use that same method on targetNode1 and targetNode2.
like image 30
Alex D Avatar answered Oct 19 '22 05:10

Alex D


A more optimal solution could be to use an array of booleans for bookkeeping. One for each node. Init them all to false. Set them to true when you visit a node. Alternate your way upward the "tree" just as you're doing today. Once you come to an already visited node (set to true in the array) then you found the common ancestor.

EDIT: A better approach would be to have an array of integers with a unique identifier for each origin node. If there are loops we need to know who has already visited the node, it might be ourselves.

like image 37
Jens Agby Avatar answered Oct 19 '22 04:10

Jens Agby


Consider the following variation of your graph...

              A
              |
          X---B---Y
           \ / \ /
            C   E  
            |   |
            D   F
             \ /
              G

nodes = {"G", "F", "E", "D", "C", "B", "A", "X", "Y"}

parentNodes = {{"F","D"},{"E"}, {"Y","B"}, {"C"}, {"X","B"}, {"A"}, null, {"B"}, {"B"}}

I think your program would run into difficulity when it hits node "C" because it has two parent nodes to explore. You either need to resort to a recursive algorithm or manage some sort of a stack for unexplored nodes using an iterative algorithm.

Since you appear to have a bias toward iteraton you could try something similar to this:

Create an array (or similar data structure). Each array element holds three pieces of information:

 nodeName: string - name of a node from your `nodes` string
 pathLength[2]: integer -  array of minimum path length from refrence node (1 or 2).

Create stack. Each stack element contans three pieces of information:

 nodeName: string - name of a node to "explore"
 lengthToHere: integer - length of path from reference node to `nodeName`
 pathNumber: integer - 1 for first reference node, 2 for second.

Algorithm:

 Initialize array with nodeName from nodes and set each pathLength to "infinity"
 Push both starting reference nodes onto the stack: 

       push("F", 0, 1)
       push "D", 0, 2)

Repeat until stack is empty:
  - pop(nodeName, lengthToHere, pathNumber)
  - update pathLength[pathNumber] of element having nodeName in array with minimum of
            its current value and lengthToHere
  - for each parent of nodeName push(parent, lengthToHere + 1, pathNumber)

The stack is empty once all paths from starting reference nodes have been evaluated. If there is a common ancestor both pathLength values for a given nodeName will be less than infinity. Add these values together giving the total path length to this common ancestor. Report the nodeName with the smallest sum.

like image 21
NealB Avatar answered Oct 19 '22 04:10

NealB