Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Graph Algorithm To Find All Connections Between Two Arbitrary Vertices

People also ask

How do you get all the paths between two nodes on a graph?

If you want to all simple paths between two nodes, you can do it with DFS with "local" visited set (that deletes a node from the visited set when it tracks back). @GarethRees Assume there is a polynomial time (NOT pseudo polynomial) algorithm for k th shortest simple path between two nodes.

Which algorithm is used to determine whether there is a path between any two vertices in graph?

Approach: Either Breadth First Search (BFS) or Depth First Search (DFS) can be used to find path between two vertices.

What is the connection between two vertices in a graph called?

Edge. An edge is the mathematical term for a line that connects two vertices.

How do you find the number of paths between two vertices?

Algorithm: Create a recursive function that takes index of node of a graph and the destination index. Keep a global or a static variable count to store the count. Keep a record of the nodes visited using a visited array and while returning mark the current node to be unvisited to discover other paths.


It appears that this can be accomplished with a depth-first search of the graph. The depth-first search will find all non-cyclical paths between two nodes. This algorithm should be very fast and scale to large graphs (The graph data structure is sparse so it only uses as much memory as it needs to).

I noticed that the graph you specified above has only one edge that is directional (B,E). Was this a typo or is it really a directed graph? This solution works regardless. Sorry I was unable to do it in C, I'm a bit weak in that area. I expect that you will be able to translate this Java code without too much trouble though.

Graph.java:

import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.LinkedList;
import java.util.Map;
import java.util.Set;

public class Graph {
    private Map<String, LinkedHashSet<String>> map = new HashMap();

    public void addEdge(String node1, String node2) {
        LinkedHashSet<String> adjacent = map.get(node1);
        if(adjacent==null) {
            adjacent = new LinkedHashSet();
            map.put(node1, adjacent);
        }
        adjacent.add(node2);
    }

    public void addTwoWayVertex(String node1, String node2) {
        addEdge(node1, node2);
        addEdge(node2, node1);
    }

    public boolean isConnected(String node1, String node2) {
        Set adjacent = map.get(node1);
        if(adjacent==null) {
            return false;
        }
        return adjacent.contains(node2);
    }

    public LinkedList<String> adjacentNodes(String last) {
        LinkedHashSet<String> adjacent = map.get(last);
        if(adjacent==null) {
            return new LinkedList();
        }
        return new LinkedList<String>(adjacent);
    }
}

Search.java:

import java.util.LinkedList;

public class Search {

    private static final String START = "B";
    private static final String END = "E";

    public static void main(String[] args) {
        // this graph is directional
        Graph graph = new Graph();
        graph.addEdge("A", "B");
        graph.addEdge("A", "C");
        graph.addEdge("B", "A");
        graph.addEdge("B", "D");
        graph.addEdge("B", "E"); // this is the only one-way connection
        graph.addEdge("B", "F");
        graph.addEdge("C", "A");
        graph.addEdge("C", "E");
        graph.addEdge("C", "F");
        graph.addEdge("D", "B");
        graph.addEdge("E", "C");
        graph.addEdge("E", "F");
        graph.addEdge("F", "B");
        graph.addEdge("F", "C");
        graph.addEdge("F", "E");
        LinkedList<String> visited = new LinkedList();
        visited.add(START);
        new Search().depthFirst(graph, visited);
    }

    private void depthFirst(Graph graph, LinkedList<String> visited) {
        LinkedList<String> nodes = graph.adjacentNodes(visited.getLast());
        // examine adjacent nodes
        for (String node : nodes) {
            if (visited.contains(node)) {
                continue;
            }
            if (node.equals(END)) {
                visited.add(node);
                printPath(visited);
                visited.removeLast();
                break;
            }
        }
        for (String node : nodes) {
            if (visited.contains(node) || node.equals(END)) {
                continue;
            }
            visited.addLast(node);
            depthFirst(graph, visited);
            visited.removeLast();
        }
    }

    private void printPath(LinkedList<String> visited) {
        for (String node : visited) {
            System.out.print(node);
            System.out.print(" ");
        }
        System.out.println();
    }
}

Program Output:

B E 
B A C E 
B A C F E 
B F E 
B F C E 

The National Institute of Standards and Technology (NIST) online Dictionary of Algorithms and Data Structures lists this problem as "all simple paths" and recommends a depth-first search. CLRS supplies the relevant algorithms.

A clever technique using Petri Nets is found here


Here is the pseudocode I came up with. This is not any particular pseudocode dialect, but should be simple enough to follow.

Anyone want to pick this apart.

  • [p] is a list of vertices representing the current path.

  • [x] is a list of paths where meet the criteria

  • [s] is the source vertex

  • [d] is the destination vertex

  • [c] is the current vertex (argument to the PathFind routine)

Assume there is an efficient way to look up the adjacent vertices (line 6).

     1 PathList [p]
     2 ListOfPathLists [x]
     3 Vertex [s], [d]

     4 PathFind ( Vertex [c] )
     5     Add [c] to tail end of list [p]
     6     For each Vertex [v] adjacent to [c]
     7         If [v] is equal to [d] then
     8             Save list [p] in [x]
     9         Else If [v] is not in list [p]
    10             PathFind([v])
    11     Next For
    12     Remove tail from [p]
    13 Return

Since the existing non-recursive DFS implementation given in this answer seems to be broken, let me provide one that actually works.

I've written this in Python, because I find it pretty readable and uncluttered by implementation details (and because it has the handy yield keyword for implementing generators), but it should be fairly easy to port to other languages.

# a generator function to find all simple paths between two nodes in a
# graph, represented as a dictionary that maps nodes to their neighbors
def find_simple_paths(graph, start, end):
    visited = set()
    visited.add(start)

    nodestack = list()
    indexstack = list()
    current = start
    i = 0

    while True:
        # get a list of the neighbors of the current node
        neighbors = graph[current]

        # find the next unvisited neighbor of this node, if any
        while i < len(neighbors) and neighbors[i] in visited: i += 1

        if i >= len(neighbors):
            # we've reached the last neighbor of this node, backtrack
            visited.remove(current)
            if len(nodestack) < 1: break  # can't backtrack, stop!
            current = nodestack.pop()
            i = indexstack.pop()
        elif neighbors[i] == end:
            # yay, we found the target node! let the caller process the path
            yield nodestack + [current, end]
            i += 1
        else:
            # push current node and index onto stacks, switch to neighbor
            nodestack.append(current)
            indexstack.append(i+1)
            visited.add(neighbors[i])
            current = neighbors[i]
            i = 0

This code maintains two parallel stacks: one containing the earlier nodes in the current path, and one containing the current neighbor index for each node in the node stack (so that we can resume iterating through the neighbors of a node when we pop it back off the stack). I could've equally well used a single stack of (node, index) pairs, but I figured the two-stack method would be more readable, and perhaps easier to implement for users of other languages.

This code also uses a separate visited set, which always contains the current node and any nodes on the stack, to let me efficiently check whether a node is already part of the current path. If your language happens to have an "ordered set" data structure that provides both efficient stack-like push/pop operations and efficient membership queries, you can use that for the node stack and get rid of the separate visited set.

Alternatively, if you're using a custom mutable class / structure for your nodes, you can just store a boolean flag in each node to indicate whether it has been visited as part of the current search path. Of course, this method won't let you run two searches on the same graph in parallel, should you for some reason wish to do that.

Here's some test code demonstrating how the function given above works:

# test graph:
#     ,---B---.
#     A   |   D
#     `---C---'
graph = {
    "A": ("B", "C"),
    "B": ("A", "C", "D"),
    "C": ("A", "B", "D"),
    "D": ("B", "C"),
}

# find paths from A to D
for path in find_simple_paths(graph, "A", "D"): print " -> ".join(path)

Running this code on the given example graph produces the following output:

A -> B -> C -> D
A -> B -> D
A -> C -> B -> D
A -> C -> D

Note that, while this example graph is undirected (i.e. all its edges go both ways), the algorithm also works for arbitrary directed graphs. For example, removing the C -> B edge (by removing B from the neighbor list of C) yields the same output except for the third path (A -> C -> B -> D), which is no longer possible.


Ps. It's easy to construct graphs for which simple search algorithms like this one (and the others given in this thread) perform very poorly.

For example, consider the task of find all paths from A to B on an undirected graph where the starting node A has two neighbors: the target node B (which has no other neighbors than A) and a node C that is part of a clique of n+1 nodes, like this:

graph = {
    "A": ("B", "C"),
    "B": ("A"),
    "C": ("A", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O"),
    "D": ("C", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O"),
    "E": ("C", "D", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O"),
    "F": ("C", "D", "E", "G", "H", "I", "J", "K", "L", "M", "N", "O"),
    "G": ("C", "D", "E", "F", "H", "I", "J", "K", "L", "M", "N", "O"),
    "H": ("C", "D", "E", "F", "G", "I", "J", "K", "L", "M", "N", "O"),
    "I": ("C", "D", "E", "F", "G", "H", "J", "K", "L", "M", "N", "O"),
    "J": ("C", "D", "E", "F", "G", "H", "I", "K", "L", "M", "N", "O"),
    "K": ("C", "D", "E", "F", "G", "H", "I", "J", "L", "M", "N", "O"),
    "L": ("C", "D", "E", "F", "G", "H", "I", "J", "K", "M", "N", "O"),
    "M": ("C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "N", "O"),
    "N": ("C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "O"),
    "O": ("C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N"),
}

It's easy to see that the only path between A and B is the direct one, but a naïve DFS started from node A will waste O(n!) time uselessly exploring paths within the clique, even though it's obvious (to a human) that none of those paths can possibly lead to B.

One can also construct DAGs with similar properties, e.g. by having the starting node A connect target node B and to two other nodes C1 and C2, both of which connect to the nodes D1 and D2, both of which connect to E1 and E2, and so on. For n layers of nodes arranged like this, a naïve search for all paths from A to B will end up wasting O(2n) time examining all the possible dead ends before giving up.

Of course, adding an edge to the target node B from one of the nodes in the clique (other than C), or from the last layer of the DAG, would create an exponentially large number of possible paths from A to B, and a purely local search algorithm can't really tell in advance whether it will find such an edge or not. Thus, in a sense, the poor output sensitivity of such naïve searches is due to their lack of awareness of the global structure of the graph.

While there are various preprocessing methods (such as iteratively eliminating leaf nodes, searching for single-node vertex separators, etc.) that could be used to avoid some of these "exponential-time dead ends", I don't know of any general preprocessing trick that could eliminate them in all cases. A general solution would be to check at every step of the search whether the target node is still reachable (using a sub-search), and backtrack early if it isn't — but alas, that would significantly slow down the search (at worst, proportionally to the size of the graph) for many graphs that don't contain such pathological dead ends.


Here is a logically better-looking recursive version as compared to the second floor.

public class Search {

private static final String START = "B";
private static final String END = "E";

public static void main(String[] args) {
    // this graph is directional
    Graph graph = new Graph();
    graph.addEdge("A", "B");
    graph.addEdge("A", "C");
    graph.addEdge("B", "A");
    graph.addEdge("B", "D");
    graph.addEdge("B", "E"); // this is the only one-way connection
    graph.addEdge("B", "F");
    graph.addEdge("C", "A");
    graph.addEdge("C", "E");
    graph.addEdge("C", "F");
    graph.addEdge("D", "B");
    graph.addEdge("E", "C");
    graph.addEdge("E", "F");
    graph.addEdge("F", "B");
    graph.addEdge("F", "C");
    graph.addEdge("F", "E");
    List<ArrayList<String>> paths = new ArrayList<ArrayList<String>>();
    String currentNode = START;
    List<String> visited = new ArrayList<String>();
    visited.add(START);
    new Search().findAllPaths(graph, seen, paths, currentNode);
    for(ArrayList<String> path : paths){
        for (String node : path) {
            System.out.print(node);
            System.out.print(" ");
        }
        System.out.println();
    }   
}

private void findAllPaths(Graph graph, List<String> visited, List<ArrayList<String>> paths, String currentNode) {        
    if (currentNode.equals(END)) { 
        paths.add(new ArrayList(Arrays.asList(visited.toArray())));
        return;
    }
    else {
        LinkedList<String> nodes = graph.adjacentNodes(currentNode);    
        for (String node : nodes) {
            if (visited.contains(node)) {
                continue;
            } 
            List<String> temp = new ArrayList<String>();
            temp.addAll(visited);
            temp.add(node);          
            findAllPaths(graph, temp, paths, node);
        }
    }
}
}

Program Output

B A C E 

B A C F E 

B E

B F C E

B F E 

Solution in C code. It is based on DFS which uses minimum memory.

#include <stdio.h>
#include <stdbool.h>

#define maxN    20  

struct  nodeLink
{

    char node1;
    char node2;

};

struct  stack
{   
    int sp;
    char    node[maxN];
};   

void    initStk(stk)
struct  stack   *stk;
{
    int i;
    for (i = 0; i < maxN; i++)
        stk->node[i] = ' ';
    stk->sp = -1;   
}

void    pushIn(stk, node)
struct  stack   *stk;
char    node;
{

    stk->sp++;
    stk->node[stk->sp] = node;

}    

void    popOutAll(stk)
struct  stack   *stk;
{

    char    node;
    int i, stkN = stk->sp;

    for (i = 0; i <= stkN; i++)
    {
        node = stk->node[i];
        if (i == 0)
            printf("src node : %c", node);
        else if (i == stkN)
            printf(" => %c : dst node.\n", node);
        else
            printf(" => %c ", node);
    }

}


/* Test whether the node already exists in the stack    */
bool    InStack(stk, InterN)
struct  stack   *stk;
char    InterN;
{

    int i, stkN = stk->sp;  /* 0-based  */
    bool    rtn = false;    

    for (i = 0; i <= stkN; i++)
    {
        if (stk->node[i] == InterN)
        {
            rtn = true;
            break;
        }
    }

    return     rtn;

}

char    otherNode(targetNode, lnkNode)
char    targetNode;
struct  nodeLink    *lnkNode;
{

    return  (lnkNode->node1 == targetNode) ? lnkNode->node2 : lnkNode->node1;

}

int entries = 8;
struct  nodeLink    topo[maxN]    =       
    {
        {'b', 'a'}, 
        {'b', 'e'}, 
        {'b', 'd'}, 
        {'f', 'b'}, 
        {'a', 'c'},
        {'c', 'f'}, 
        {'c', 'e'},
        {'f', 'e'},               
    };

char    srcNode = 'b', dstN = 'e';      

int reachTime;  

void    InterNode(interN, stk)
char    interN;
struct  stack   *stk;
{

    char    otherInterN;
    int i, numInterN = 0;
    static  int entryTime   =   0;

    entryTime++;

    for (i = 0; i < entries; i++)
    {

        if (topo[i].node1 != interN  && topo[i].node2 != interN) 
        {
            continue;   
        }

        otherInterN = otherNode(interN, &topo[i]);

        numInterN++;

        if (otherInterN == stk->node[stk->sp - 1])
        {
            continue;   
        }

        /*  Loop avoidance: abandon the route   */
        if (InStack(stk, otherInterN) == true)
        {
            continue;   
        }

        pushIn(stk, otherInterN);

        if (otherInterN == dstN)
        {
            popOutAll(stk);
            reachTime++;
            stk->sp --;   /*    back trace one node  */
            continue;
        }
        else
            InterNode(otherInterN, stk);

    }

        stk->sp --;

}


int    main()

{

    struct  stack   stk;

    initStk(&stk);
    pushIn(&stk, srcNode);  

    reachTime = 0;
    InterNode(srcNode, &stk);

    printf("\nNumber of all possible and unique routes = %d\n", reachTime);

}