Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pattern or algorithm to merge branches in tree structure?

I am trying to map a DAG (Directed Acyclic Graph) into the structure I show below.

Here is an example of the DAG I start from

enter image description here

where arcs always go from left to right.

I then revert the graph and span it into a tree with repeated nodes like this:

enter image description here

What I am looking for is some algorithm or pattern to achieve the following merged structure. (Note it is reverted again)

enter image description here

The goal is to generate an XML like this:

<root>
    <seq>
        <mod1/>
        <flow>
            <seq>
                <mod4/>
                <mod7/>
            </seq>
            <seq>
                <flow>
                    <seq>
                        <flow>
                            <mod4/>
                            <mod3/>
                        </flow>
                        <mod6/>
                    </seq>
                    <seq>
                        <flow>
                            <mod4/>
                            <mod3/>
                            <mod2/>
                        </flow>
                        <mod5/>
                    </seq>
                </flow>
                <mod8/>
            </seq>
        </flow>
    </seq>
</root>

I don't think it's relevant but I am parsing JSON to write an XML with JAVA 7. Boxes are web services and the arrows represent input and output parameteres, so for instance, Module 5 is called once Modules 1,2,3 and 4 have finished and their outputs are its inputs.

EDIT: Ok, here is another example with ten nodes. I hope this gives you a better appreciation of when I nodes are meant to be merged.

enter image description here

To answer @blubb, in this example we can see how "services" 8 & 9 have also been merged. Otherwise all the services they need to work (1,2,3,4,5 and 6) would be call twice with no need. The middle branch in the last sketch would be executed twice: once for 8 and once for 9.

like image 949
eskalera Avatar asked Apr 10 '13 14:04

eskalera


2 Answers

I do not know too much about tree data structures, which I imagine might be a good housing for the result, and I'm not too knowledgeable about the conversion to XML, but if I were given the data with the routes mapped out, for example,

1 4 7
1 2 5 8
1 3 5 8
1 4 5 8
1 3 6 8
1 2 5 9
1 3 5 9
1 4 5 9
1 3 6 9
1 2 10
1 2 5 10
1 3 5 10
1 4 5 10

then one way to merge nodes could be:

Take increasingly larger chunks from the end of each line and examine the
first cell to the left of them. Nodes are merged if matching right-side 
chunks flow to the same aggregated first cells on the left. Remove duplicate 
paths.


Explanation/example:


First pass (take end-cells, compare with aggregated first cells to the left of them):

  4   <- 7
  5,6 <- 8
  5,6 <- 9
  2,5 <- 10

The only nodes that can be merged are 8 and 9 since they both flow to the same aggregated cells (5,6).

Result of first pass:

1 4 7
1 2 5 (8,9) -- merged
1 3 5 (8,9)
1 4 5 (8,9)
1 3 6 (8,9)
1 2 5 (8,9)
1 3 5 (8,9)
1 4 5 (8,9)
1 3 6 (8,9)
1 2 10
1 2 5 10
1 3 5 10
1 4 5 10


Second pass (take end-cells + 1 cell, compare with aggregated first cells to the left):

  1      <- 4 7
  2,3,4  <- 5 (8,9)
  3      <- 6 (8,9)
  1      <- 2 10
  2,3,4  <- 5 10

None can be merged since no matching right-hand-side paths flow to the same aggregated first cells on their left.


Third pass (take end-cells + 2 cells, compare with aggregated first cells to the left):

  N/A    <- 1 4 7
  1      <- 2 5 (8,9)
  1      <- 3 5 (8,9)
  1      <- 4 5 (8,9)
  1      <- 3 6 (8,9)
  N/A    <- 1 2 10
  1      <- 2 5 10
  1      <- 3 5 10
  1      <- 4 5 10

Two merges are possible. Firstly: [2 5 (8,9)], [3 5 (8,9)], and [4 5 (8,9)] all flow to 1. Secondly: [2 5 10], [3 5 10], and [4 5 10] all flow to 1.

Result of third pass:

1 4 7
1 (2,3,4) 5 (8,9)
1 3 6 (8,9)
1 2 10
1 (2,3,4) 5 10

Looks a lot like the requested result to me. (Duplicate cells at the ends can be merged to single nodes, i.e., 1 on the left, and (8,9) and 10 on the right, as in eskalera's final sketch.)

like image 167
גלעד ברקן Avatar answered Oct 21 '22 13:10

גלעד ברקן


Finally, I found an algorithm that did the job. Here it is for all of you who tried to help me out:

First of all I built an inverted spanning tree from the DAG in sketch 1. So I started from modules 7 and 8, building the tree backwards and where modules are duplicated.

After that I create virtual nodes called FLOW and SEQUENCE and introduce them in the tree so that every MODULE node is a child of a SEQUENCE node. Spanning branches are SEQUENCE nodes which are childs of FLOW nodes. I think this is step is intuitive enough but the important idea is to understand that we need virtual nodes so we can close the FLOW nodes, which are the ones splitting from one node to more than one.

After that I go over the tree depth first, and for every module(we'll call it the driver) I compare its children with the children of the driver's siblings. If they don't match I keep going down with the grandchildren of the driver's siblings, so that, all branches coming out of the driver's siblings must pass through the same nodes as the driver does. Graphically this means that at some point, both nodes need the exact same modules.

If they match I clean the merging branches from the coinciding nodes downwards, which means I cut them off their parents. From there upwards it goes into a new SEQUENCE node together with the drivers SEQUENCE node, into the same FLOW node.

After going over the whole tree, as long as a merge has been made, we go over the tree again, this time with a greater relationship. This means that instead of comparing the driver's children, we compare the driver's great children.

The last step is obviously to revert the tree again.

There are some concepts I have left aside because of the intricacies the programming of these virtual nodes imply. Mainly due to all the parent-children-sibling relationship being lost once virtual nodes have been introduced. But I hope the general idea has been understood.

like image 23
eskalera Avatar answered Oct 21 '22 13:10

eskalera