Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Diff for Directed Acyclic Graphs

Tags:

I'm looking for an algorithm which can diff two Directed Acyclic Graphs(DAGs). That is, I'd like an algorithm which produces a sequence of deletions and insertions on the first DAG to produce the second DAG.

I'm not a hundred percent sure, but I think a longest common subsequence can be applied to the DAGs. I'm less concerned about the length of the resulting edit sequence (as long as it's short enough) and more concerned about the running time of the algorithm.

One complication is that none of my vertices are labelled except for a single root node. The root node is also the only node with zero in-edges. The graph's edges are labelled, and the 'data' in the graph is represented by the paths from the root to the leaves. This is similar to a trie but with a directed graph instead of a tree. Actually my graphs are quite similar to the directed acyclic word graph data structure.

Here's an example.

DAG1

DAG1

DAG2

DAG2

To get DAG 2, you simply add a vertex from the root to another vertex with label 'b'. From that vertex there is an edge to the final 'ac' vertex in DAG 1 and an edge to a new vertex whose label is 'd'. From that final vertex there is another edge to the 'ac' vertex in DAG 1. I'd post a link to the diff in DAG form, but I can't post more than two links.

Thanks and hope this is legible enough.

like image 276
Nomad010 Avatar asked May 14 '13 21:05

Nomad010


People also ask

How will you differentiate directed acyclic graphs from trees?

An acyclic graph is a graph without cycles (a cycle is a complete circuit). When following the graph from node to node, you will never visit the same node twice. This graph (the thick black line) is acyclic, as it has no cycles (complete circuits). A connected acyclic graph, like the one above, is called a tree.

What is the most significant difference between a directed graph and a directed acyclic graph?

In a directed graph, the edges are connected so that each edge only goes one way. A directed acyclic graph means that the graph is not cyclic, or that it is impossible to start at one point in the graph and traverse the entire graph.

Is DAG same as tree?

Tree is a special kind of graph that has no cycle so that is known as DAG (Directed Acyclic Graph).

What is directed acyclic graph used for?

A directed acyclic graph may be used to represent a network of processing elements. In this representation, data enters a processing element through its incoming edges and leaves the element through its outgoing edges.


2 Answers

This might be a bit too late but just for fun: Both of your DAGs can be expressed as matrices, with row index indicating the "from" vertex, and the column index indicating the "to" vertex, and the corresponding cell labeled with edge id. You can give vertex unique and random ids.

The next part is a bit tricky, because only your edges have meaningful label that maps from DAG1 to DAG2. Suppose you have a set of edges E* that are the intersect of labeled edges from DAG1 and DAG2, you will need to perform a series of row shift (move up or down) or column shift (move left or right) so position of all edges in E* in DAG1 and DAG2 maps to each other. Note that for a DAG represented in Matrix, shifting position of entire row or entire column still makes the representation equivalent.

The remaining operation would be to rename the vertex according to the mapped matrices, compare the two matrices, and identify the new edges and new vertex required (and edges and vertices that can be removed.

like image 62
firemana Avatar answered Sep 30 '22 14:09

firemana


How would your specific data representation show that edges c and x in your DAG 2 example terminate in the same vertex?

If we assume Wikipedia's general definitions of "directed graph", "vertex", and "edge", there is no such thing as an "unlabeled vertex" because without labeling them, there would be no way to describe edges, according to the definition there.

As is, it seems to me your question is impossible to answer. Please provide (1) a simple example of the input provided to the algorithm — a data structure describing each graph as a collection of vertices and edges — and the expected output in a similar way, and (2) a consistent way to distinguish if an edge or vertex in the first DAG is equivalent to one in the second DAG, implying no difference in that aspect of the graphs.

Perhaps your question is actually mostly about how to determine the labels for the vertices in each DAG in the input and how to best correlate them. Or, alternatively, perhaps labels are just a convenience to describe each graph and the question is actually seeking the minimal set of changes to describe a transformation of the structure of one graph to another.

That said, edges and vertices in a traditional, mathematical, definition of a graph are atomic. Each vertex or edge either exists or does not exist in any one graph, making the concept of a diff somewhat meaningless, or otherwise trivial to build, if we assume that an identical label for any specific vertex or edge represents the exact same vertex or edge in both graphs.

Such a trivial algorithm would basically just enumerate each vertex and edge in the two DAGs and add the appropriate operations to the diff, choosing only from the following operations:

add vertex v remove vertex v add edge e remove edge e switch direction for edge e 
like image 23
גלעד ברקן Avatar answered Sep 30 '22 14:09

גלעד ברקן