Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Converting Directed Acyclic Graph (DAG) to tree

I'm trying to implement algoritm to convert Directed Acyclic Graph to Tree (for fun, learining, kata, name it). So I come up with the data structure Node:

DAG to Tree

/// <summary>
/// Represeting a node in DAG or Tree
/// </summary>
/// <typeparam name="T">Value of the node</typeparam>
public class Node<T> 
{
    /// <summary>
    /// creats a node with no child nodes
    /// </summary>
    /// <param name="value">Value of the node</param>
    public Node(T value)
    {
        Value = value;
        ChildNodes = new List<Node<T>>();
    }

    /// <summary>
    /// Creates a node with given value and copy the collection of child nodes
    /// </summary>
    /// <param name="value">value of the node</param>
    /// <param name="childNodes">collection of child nodes</param>
    public Node(T value, IEnumerable<Node<T>> childNodes)
    {
        if (childNodes == null)
        {
            throw new ArgumentNullException("childNodes");
        }
        ChildNodes = new List<Node<T>>(childNodes);
        Value = value;
    }

    /// <summary>
    /// Determines if the node has any child node
    /// </summary>
    /// <returns>true if has any</returns>
    public bool HasChildNodes
    {
        get { return this.ChildNodes.Count != 0; }
    }


    /// <summary>
    /// Travearse the Graph recursively
    /// </summary>
    /// <param name="root">root node</param>
    /// <param name="visitor">visitor for each node</param>
    public void Traverse(Node<T> root, Action<Node<T>> visitor)
    {
        if (root == null)
        {
            throw new ArgumentNullException("root");
        }
        if (visitor == null)
        {
            throw new ArgumentNullException("visitor");
        }

        visitor(root); 
        foreach (var node in root.ChildNodes)
        {
            Traverse(node, visitor);
        }
    }

    /// <summary>
    /// Value of the node
    /// </summary>
    public T Value { get; private set; }

    /// <summary>
    /// List of all child nodes
    /// </summary>
    public List<Node<T>> ChildNodes { get; private set; }
}

It's pretty straightforward. Methods:

/// <summary>
/// Helper class for Node 
/// </summary>
/// <typeparam name="T">Value of a node</typeparam>
public static class NodeHelper
{
    /// <summary>
    /// Converts Directed Acyclic Graph to Tree data structure using recursion.
    /// </summary>
    /// <param name="root">root of DAG</param>
    /// <param name="seenNodes">keep track of child elements to find multiple connections (f.e. A connects with B and C and B also connects with C)</param>
    /// <returns>root node of the tree</returns>
    public static Node<T> DAG2TreeRec<T>(this Node<T> root, HashSet<Node<T>> seenNodes)
    {
        if (root == null)
        {
            throw new ArgumentNullException("root");
        }
        if (seenNodes == null)
        {
            throw new ArgumentNullException("seenNodes");
        }

        var length = root.ChildNodes.Count;
        for (int i = 0; i < length; ++i)
        {
            var node = root.ChildNodes[i];
            if (seenNodes.Contains(node))
            {
                var nodeClone = new Node<T>(node.Value, node.ChildNodes);
                node = nodeClone;
            }
            else
            {
                seenNodes.Add(node);
            }
            DAG2TreeRec(node, seenNodes);
        }
        return root;
    }
    /// <summary>
    /// Converts Directed Acyclic Graph to Tree data structure using explicite stack.
    /// </summary>
    /// <param name="root">root of DAG</param>
    /// <param name="seenNodes">keep track of child elements to find multiple connections (f.e. A connects with B and C and B also connects with C)</param>
    /// <returns>root node of the tree</returns>
    public static Node<T> DAG2Tree<T>(this Node<T> root, HashSet<Node<T>> seenNodes)
    {
        if (root == null)
        {
            throw new ArgumentNullException("root");
        }
        if (seenNodes == null)
        {
            throw new ArgumentNullException("seenNodes");
        }

        var stack = new Stack<Node<T>>();
        stack.Push(root);

        while (stack.Count > 0) 
        {
            var tempNode = stack.Pop();
            var length = tempNode.ChildNodes.Count;
            for (int i = 0; i < length; ++i)
            {
                var node = tempNode.ChildNodes[i];
                if (seenNodes.Contains(node))
                {
                    var nodeClone = new Node<T>(node.Value, node.ChildNodes);
                    node = nodeClone;
                }
                else
                {
                    seenNodes.Add(node);
                }
               stack.Push(node);
            }
        } 
        return root;
    }
}

and test:

    static void Main(string[] args)
    {
        // Jitter preheat
        Dag2TreeTest();
        Dag2TreeRecTest();

        Console.WriteLine("Running time ");
        Dag2TreeTest();
        Dag2TreeRecTest();

        Console.ReadKey();
    }

    public static void Dag2TreeTest()
    {
        HashSet<Node<int>> hashSet = new HashSet<Node<int>>();

        Node<int> root = BulidDummyDAG();

        Stopwatch stopwatch = new Stopwatch();
        stopwatch.Start();
        var treeNode = root.DAG2Tree<int>(hashSet);
        stopwatch.Stop();

        Console.WriteLine(string.Format("Dag 2 Tree = {0}ms",stopwatch.ElapsedMilliseconds));

    }

    private static Node<int> BulidDummyDAG()
    {
        Node<int> node2 = new Node<int>(2);
        Node<int> node4 = new Node<int>(4);
        Node<int> node3 = new Node<int>(3);
        Node<int> node5 = new Node<int>(5);
        Node<int> node6 = new Node<int>(6);
        Node<int> node7 = new Node<int>(7);
        Node<int> node8 = new Node<int>(8);
        Node<int> node9 = new Node<int>(9);
        Node<int> node10 = new Node<int>(10);
        Node<int> root  = new Node<int>(1);

        //making DAG                   
        root.ChildNodes.Add(node2);    
        root.ChildNodes.Add(node3);    
        node3.ChildNodes.Add(node2);   
        node3.ChildNodes.Add(node4);   
        root.ChildNodes.Add(node5);    
        node4.ChildNodes.Add(node6);   
        node4.ChildNodes.Add(node7);
        node5.ChildNodes.Add(node8);
        node2.ChildNodes.Add(node9);
        node9.ChildNodes.Add(node8);
        node9.ChildNodes.Add(node10);

        var length = 10000;
        Node<int> tempRoot = node10; 
        for (int i = 0; i < length; i++)
        {
            var nextChildNode = new Node<int>(11 + i);
            tempRoot.ChildNodes.Add(nextChildNode);
            tempRoot = nextChildNode;
        }

        return root;
    }

    public static void Dag2TreeRecTest()
    {
        HashSet<Node<int>> hashSet = new HashSet<Node<int>>();

        Node<int> root = BulidDummyDAG();

        Stopwatch stopwatch = new Stopwatch();
        stopwatch.Start();
        var treeNode = root.DAG2TreeRec<int>(hashSet);
        stopwatch.Stop();

        Console.WriteLine(string.Format("Dag 2 Tree Rec = {0}ms",stopwatch.ElapsedMilliseconds));
    }

What is more, data structure need some improvment:

  • Overriding GetHash, toString, Equals, == operator
  • implementing IComparable
  • LinkedList is probably a better choice

Also, before the conversion there are certian thigs that need to be checked:

  • Multigraphs
  • If it's DAG (Cycles)
  • Diamnods in DAG
  • Multiple roots in DAG

All in all, it narrows down to a few questions: How can I improve the conversion? Since this is a recurion it's possible to blow up the stack. I can add stack to memorize it. If I do continuation-passing style, will I be more efficient?

I feel that immutable data structure in this case would be better. Is it correct?

Is Childs the right name ? :)

like image 340
Lukasz Madon Avatar asked Jun 18 '11 17:06

Lukasz Madon


People also ask

How do you convert a directed graph to a tree?

There is an edge from i to arr[i]. The task is to convert this directed graph into tree by changing some of the edges. If for some i, arr[i] = i then i represents the root of the tree. In case of multiple answers print any of them.

Can a DAG be a tree?

A tree is a connected undirected acyclic graph. If the underlying graph of a DAG is a tree, then the graph is a polytree. A leaf is a vertex of degree one. Every tree with at least two vertices has at least two leaves.

Can a directed acyclic graph be a tree?

A polytree (or directed tree or oriented tree or singly connected network) is a directed acyclic graph (DAG) whose underlying undirected graph is a tree. In other words, if we replace its directed edges with undirected edges, we obtain an undirected graph that is both connected and acyclic.

Is DAG and tree same?

Trees have direction (parent / child relationships) and don't contain cycles. They fit with in the category of Directed Acyclic Graphs (or a DAG). So Trees are DAGs with the restriction that a child can only have one parent. One thing that is important to point out, Trees aren't a recursive data structure.


2 Answers

Algorithm:

  • As you observed, some nodes appear twice in the output. If the node 2 had children, the whole subtree would appear twice. If you want each node to appear just once, replace

    if (hashSet.Contains(node))
    {
        var nodeClone = new Node<T>(node.Value, node.Childs);
        node = nodeClone;
    }
    

    with

    if (hashSet.Contains(node))
    {
        // node already seen -> do nothing
    }
    
  • I wouldn't be too worried about the size of the stack or performance of recursion. However, you could replace your Depth-first-search with Breadth-first-search which would result in nodes closer to the root being visited earlier, thus yielding a more "natural" tree (in your picture you already numbered the nodes in BFS order).

     var seenNodes = new HashSet<Node>();
     var q = new Queue<Node>();
     q.Enqueue(root);
     seenNodes.Add(root);   
    
     while (q.Count > 0) {
         var node = q.Dequeue();
         foreach (var child in node.Childs) {
             if (!seenNodes.Contains(child )) {
                 seenNodes.Add(child);
                 q.Enqueue(child);
         }
     }
    

    The algorithm handles diamonds and cycles.

  • Multiple roots

    Just declare a class Graph which will contain all the vertices

    class Graph
    {
        public List<Node> Nodes { get; private set; }
        public Graph()
        {
            Nodes = new List<Node>();
        }    
    }
    

Code:

  • the hashSet could be named seenNodes.

  • Instead of

    var length = root.Childs.Count;
    for (int i = 0; i < length; ++i)
    {
        var node = root.Childs[i];
    

    write

    foreach (var child in root.Childs)
    
  • In Traverse, the visitor is quite unnecessary. You could rather have a method which yields all the nodes of the tree (in the same order traverse does) and it is up to user to do whatever with the nodes:

    foreach(var node in root.TraverseRecursive())
    {
        Console.WriteLine(node.Value);
    }
    
  • If you override GetHashCode and Equals, the algorithm will no more be able to distinguish between two different Nodes with same value, which is probably not what you want.

  • I don't see any reason why LinkedList would be better here than List, except for the reallocations (Capacity 2,4,8,16,...) which List does when adding nodes.

like image 153
Martin Konicek Avatar answered Sep 22 '22 09:09

Martin Konicek


  1. you had better posted in CodeReview
  2. Childs is wrong => Children
  3. you don't have to use a HashSet, you could have easily used a List>, because checking references only is enough here. (and so no GetHashCode, Equals and operators overriding is needed)

  4. easeier way is Serializing your class and then Deserializing it again into second objectwith XmlSerializer. while Serialized and Deserialized, 1 object referenced 2 times will become 2 objects with different references.

like image 35
Maziar Taheri Avatar answered Sep 22 '22 09:09

Maziar Taheri