Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm for computing partial orderings of dependency graphs

I'm trying to compute a partial "topological sort" of a dependency graph, which is actually a DAG (Directed Acyclic Graph) to be precise; so as to execute tasks without conflicting dependencies in parallel.

I came up with this simple algorithm because what I found on Google wasn't all that helpful (I keep finding only algorithms that themselves run in parallel to compute a normal topological sort).

visit(node)
{
    maxdist = 0;
    foreach (e : in_edge(node))
    {
        maxdist = max(maxdist, 1 + visit(source(e)))
    }
    distance[node] = maxdist;
    return distance[node];
}

make_partial_ordering(node)
{
    set all node distances to +infinite;
    num_levels = visit(node, 0);
    return sort(nodes, by increasing distance) where distances < +infinite;
}

(Note that this is only pseudocode and there would certainly be a few possible small optimizations)

As for the correctness, it seems pretty obvious: For the leafs (:= nodes which have no further dependency) the maximum distance-to-leaf is always 0 (correct because the loop gets skipped due to 0 edges). For any node connected to nodes n1,..,nk the maximum distance-to-leaf is 1 + max{distance[n1],..,distance[nk]}.

I did find this article after I had written down the algorithm: http://msdn.microsoft.com/en-us/magazine/dd569760.aspx But honestly, why are they doing all that list copying and so on, it just seems so really inefficient..?

Anyway I was wondering whether my algorithm is correct, and if so what it is called so that I can read up some stuff about it.

Update: I implemented the algorithm in my program and it seems to be working great for everything I tested. Code-wise it looks like this:

  typedef boost::graph_traits<my_graph> my_graph_traits;
  typedef my_graph_traits::vertices_size_type vertices_size_type;
  typedef my_graph_traits::vertex_descriptor vertex;
  typedef my_graph_traits::edge_descriptor edge;

  vertices_size_type find_partial_ordering(vertex v,
      std::map<vertices_size_type, std::set<vertex> >& distance_map)
  {
      vertices_size_type d = 0;

      BOOST_FOREACH(edge e, in_edges(v))
      {
          d = std::max(d, 1 + find_partial_ordering(source(e), distance_map));
      }

      distance_map[d].insert(v);

      return d;
  }
like image 732
d3ddev Avatar asked Feb 15 '11 14:02

d3ddev


People also ask

What is Kahn's algorithm?

Topological sorting for Directed Acyclic Graph (DAG) is a linear ordering of vertices such that for every directed edge uv, vertex u comes before v in the ordering. Topological Sorting for a graph is not possible if the graph is not a DAG.

Which of the following algorithm is used for topological sorting?

Algorithm to find Topological Sorting: We recommend to first see the implementation of DFS. We can modify DFS to find Topological Sorting of a graph. In DFS, we start from a vertex, we first print it and then recursively call DFS for its adjacent vertices. In topological sorting, we use a temporary stack.

Which graph algorithm do build tools like make use to efficiently determine the order in which to build dependencies?

Which of the following standard graph algorithms is used by Make. Explanation: Make can decide the order of building a software using topological sorting. Topological sorting produces the order considering all dependencies provide by makefile.


1 Answers

Your algorithm (C++) works, but it has very bad worst-case time complexity. It computes find_partial_ordering on a node for as many paths as there are to that node. In the case of a tree, the number of paths is 1, but in a general directed acyclic graph the number of paths can be exponential.

You can fix this by memoizing your find_partial_ordering results and returning without recursing when you have already computed the value for a particular node. However, this still leaves you with a stack-busting recursive solution.

An efficient (linear) algorithm for topological sorting is given on Wikipedia. Doesn't this fit your needs?

Edit: ah, I see, you want to know where the depth boundaries are so that you can parallelize everything at a given depth. You can still use the algorithm on Wikipedia for this (and so avoid recursion).

First, Do a topological sort with the algorithm on Wikipedia. Now compute depths by visiting nodes in topological order:

depths : array 1..n
for i in 1..n
    depths[i] = 0
    for j in children of i
        depths[i] = max(depths[i], depths[j] + 1)
return depths

Notice that there's no recursion above, just a plain O(|V| + |E|) algorithm. This has the same complexity as the algorithm on Wikipedia.

like image 177
rlibby Avatar answered Sep 17 '22 16:09

rlibby