Merging graphs in Graphviz

Tags: graphviz, dot

I have a collection of digraphs encoded in the DOT language, and I want to merge them into a single digraph in which nodes that have the same name in different input graphs are merged together.

For example given the following files:

1.dot:

digraph {
    A -> B
    A -> C
}

2.dot:

digraph {
    D -> E
    E -> F
}

3.dot:

digraph {
    D -> G
    G -> A
}

I would like to obtain the following result.dot:

digraph {
  subgraph {
    A -> B
    A -> C
  }
  subgraph {
    D -> E
    E -> F
  }
  subgraph {
    D -> G
    G -> A
  }
}

I tried to use gvpack, but it renames duplicate nodes:

> gvpack -u 1.dot 2.dot 3.dot
Warning: node D in graph[2] %15 already defined
Some nodes will be renamed.
digraph root {
        node [label="\N"];
        {
                node [label="\N"];
                A -> B;
                A -> C;
        }
        {
                node [label="\N"];
                D -> E;
                E -> F;
        }
        {
                node [label="\N"];
                D_gv1 -> G;
                G -> A_gv1;
        }
}

I found a similar question on SO that suggests using sed to undo the renaming, but that doesn't seem very clean.

Is there a way to merge the graphs the way I would like them?

asked Nov 08 '18 06:11 by gotson


1 Answer

For exactly the situation you describe, with the sample files you provide, there is a very simple answer using m4, a standard macro processor that is installed by default or readily available on most GNU/Linux distributions.

Create a file merge123.m4 with this content:

digraph 123 {
define(`digraph',`subgraph')
include(1.dot)
include(2.dot)
include(3.dot)
}

and execute it with the command

m4 merge123.m4 > 123.dot

and the resulting 123.dot file will be

digraph 123 {

subgraph {
    A -> B
    A -> C
}

subgraph {
    D -> E
    E -> F
}

subgraph {
    D -> G
    G -> A
}

}
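To try the steps above end to end, something like the following shell sketch should do. It assumes m4 is on your PATH; the merge-demo directory name is arbitrary and only there to keep the example files in one place.

```shell
#!/bin/sh
# Sketch: recreate the three sample graphs and merge them with m4.
# Assumptions: m4 is on PATH; the directory name merge-demo is arbitrary.
set -e
command -v m4 >/dev/null 2>&1 || { echo "m4 not found, skipping" >&2; exit 0; }

demo=merge-demo
mkdir -p "$demo"

# The sample inputs from the question.
printf 'digraph {\n    A -> B\n    A -> C\n}\n' > "$demo/1.dot"
printf 'digraph {\n    D -> E\n    E -> F\n}\n' > "$demo/2.dot"
printf 'digraph {\n    D -> G\n    G -> A\n}\n' > "$demo/3.dot"

# The quoted 'EOF' delimiter stops the shell from interpreting m4's quote characters.
cat > "$demo/merge123.m4" <<'EOF'
digraph 123 {
define(`digraph',`subgraph')
include(1.dot)
include(2.dot)
include(3.dot)
}
EOF

# include() looks up file names relative to the current directory,
# so run m4 from inside the demo directory.
(cd "$demo" && m4 merge123.m4 > 123.dot)
cat "$demo/123.dot"
```

The first line of merge123.m4 keeps its digraph keyword because it is read before the define takes effect; only the included files are rewritten to subgraph.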

If you don't like the empty lines, end each line in the script with dnl (the builtin dnl stands for “Discard to Next Line”), for example:

include(1.dot)dnl

m4 is extremely useful here: it adds macros and file inclusion to Graphviz input, which helps a lot in more involved projects; see also this SO question.

EDITED to answer the question in the comment:

If you need to include files and don't know their number and names, you have (at least) two options:

1) If the number of files is rather small and you know all names that they could possibly have, you can sinclude() them all:

digraph 123 {
define(`digraph',`subgraph')
sinclude(1.dot)
sinclude(2.dot)
sinclude(3.dot)
sinclude(4.dot)
sinclude(5.dot)
}

m4 will only include the files that actually exist, and not complain about the missing ones (the s means "silent").
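A quick way to see this behaviour is a sketch like the one below, where only 1.dot exists and 2.dot is deliberately missing. It assumes m4 is on your PATH; the sinclude-demo directory and the errors.txt file are names I made up for the example.

```shell
#!/bin/sh
# Sketch: sinclude() skips a missing file without printing an error.
# Assumptions: m4 is on PATH; sinclude-demo and errors.txt are arbitrary names.
set -e
command -v m4 >/dev/null 2>&1 || { echo "m4 not found, skipping" >&2; exit 0; }

demo=sinclude-demo
mkdir -p "$demo"

# Only 1.dot is created; 2.dot is removed to make sure it is absent.
printf 'digraph {\n    A -> B\n    A -> C\n}\n' > "$demo/1.dot"
rm -f "$demo/2.dot"

cat > "$demo/merge.m4" <<'EOF'
digraph 123 {
define(`digraph',`subgraph')dnl
sinclude(1.dot)dnl
sinclude(2.dot)dnl
}
EOF

# Capture stderr separately so it is easy to check that m4 stayed quiet.
(cd "$demo" && m4 merge.m4 > merged.gv 2> errors.txt)
cat "$demo/merged.gv"
```

The merged file should contain a single subgraph for 1.dot, and errors.txt should stay empty.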

2) If you produce a larger number of .dot files with unpredictable names, you will need to do some pre-processing. Create a shell script include.sh similar to this one:

#!/bin/sh
# get *.dot files (or any pattern you like) into one place
ls *.dot > files.txt
# bring them into a format m4 likes
awk '{print "include(" $1 ")" "dnl"}' files.txt > includes.txt
#done

includes.txt now provides m4 with the necessary information:

include(1.dot)dnl
include(2.dot)dnl
include(3.dot)dnl
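If your file names may contain spaces, or a word that m4 would expand (such as digraph once it has been redefined), the ls/awk approach gets fragile, because awk's $1 keeps only the first whitespace-separated field. A possible variant, as a sketch only: loop over the glob directly and wrap each name in m4 quotes. The function name make_includes and the includes-demo directory are hypothetical, and names containing a backquote or an apostrophe would still need extra care.

```shell
#!/bin/sh
# Sketch: build includes.txt without parsing ls output.
# make_includes and includes-demo are hypothetical names for illustration.
set -e

make_includes() {
    for f in *.dot; do
        # If the glob matched nothing, $f is the literal pattern: skip it.
        [ -e "$f" ] || continue
        # Put the name in m4 quotes so spaces and macro names pass through untouched.
        printf 'include(`%s'\'')dnl\n' "$f"
    done
}

demo=includes-demo
mkdir -p "$demo"
: > "$demo/1.dot"
: > "$demo/my digraph.dot"

# Run the function inside the demo directory so the glob sees its files.
(cd "$demo" && make_includes > includes.txt)
cat "$demo/includes.txt"
```

Each line of the generated includes.txt then has the form include(`name')dnl, which m4 reads the same way as the unquoted version for simple names.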

Now modify your m4 file (called merge.m4 here) so that it uses the generated file list. I'm adding dnl to avoid lots of empty lines in the resulting merged file:

### merge dot files
digraph 123 {
define(`digraph',`subgraph')dnl
syscmd(`./include.sh')dnl
include(`includes.txt')dnl
}

To keep the merged result from being picked up as an input on the next run (include.sh collects every *.dot file), write it to a file with a different extension:

m4 merge.m4 > merged.gv

which now looks like

### merge dot files
digraph 123 {
subgraph {
    A -> B
    A -> C
}
subgraph {
    D -> E
    E -> F
}
subgraph {
    D -> G
    G -> A
}
}
answered Oct 12 '22 06:10 by vaettchen