Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimizing connectivity queries in a Directed Acyclic Graph

This is one of the personal projects that I am working on. I have a DAG of N nodes ( say million ) that I will querying for connectivity of two nodes [ isConnected(a,b} ]. I will be querying the DAG online M ( say million ) times. Is there a way to optimize the process?

Here are the best approaches that I could come up with.

BFS = O( M * N )

Dijkstra = O( M* E* log N) where E is the number of edges in the graph.

Is there any other better approach for this process ? I am right now using the second strategy. This takes forever in my system.

like image 358
Crusher Avatar asked Jan 22 '26 05:01

Crusher


1 Answers

I can see some speed-ups, but I cannot solve the problem efficiently in general.

First, consider the graph as an undirected graph and split it out into connected components. Two points not in the same connected component must be disconnected.

For each connected component, consider it again as a directed graph and split it into strongly connected components. Two points in the same strongly connected component must be connected.

Now consider each connected component as a DAG of nodes, where each node is actually a strongly connected component. You can topologically sort this. If node A is upstream from node B then there is no path against the flow from B to A.

Your DAG probably isn't a tree, but if it is you can solve the Lowest Common Ancestor problem problem on it efficiently. The only path from A to B goes up to its lowest common ancestor and back down again. If both those routes are in the right direction for a topological sort then you can get from A to B.

For a completely different approach, a google search will find many algorithms for fast shortest path in large networks. It is possible that some of these may be applicable.

like image 191
mcdowella Avatar answered Jan 23 '26 20:01

mcdowella