Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Serialization / Derialization of a tree structure

I'm trying to figure out the best way to save (serialize) and later open (deserialize) a tree structure. My structure is made up of various object types with different properties, but each inherits from a base abstract "Node" class.

Each node has unique ID (GUID), and has an AddSuperNode(Node nd) method that will set the parent of a node. This in turn calls other methods that allow the parent node to know what sub nodes it has. However, some nodes also utilize a AddAuxSuperNode() method that adds a secondary parent to the Node.

I was using binary serialization, but now I think I want to use something where I have a bit more control, and the serialized data is more accessible. I also want to retain Type information when I deserialize, and be able to serialize private values. So DataContractSerializer seemed like the best way to go.

I can't just serialize the root Node directly because of nodes having multiple parents. I do not want to create duplicate objects. So it would seem that I need to deconstruct the tree into a flat list, and then serialize that. Then after serializing that list reconstruct the tree. Does this sound right?

Like I said before each Node has a unique GUID identifier, but right now Nodes reference their parents/children directly and do not store their ids. I could update the AddSuperNode() and AddAuxSuperNode() methods to also update a list of parent ids to be serialized in addition to the direct references. But I'd rather only update/create this list when the object is being serialized. So i was thinking create an UpdateSuperNodeIDRefs() method in the node that would be called right before serialization.

The following is what I'm planning to do for serialization and deserialization of this structure. Can anyone suggestion a better/cleaner/more efficient way to do this?

Serialization

1) Provide the root node of the tree structure

2) Break down tree structure into a flat Dictionary(Guid id,Node nd) where id is the guid of nd.

3) Call UpdateSuperNodeIDRefs(); for each node to update the IDs it has saved for its parents.

4) Serialize the Dictionary of nodes with DataContractSerializer

Deserialization

1) Deserialize the Dictionary of nodes

2) Itterate through each Node in the Dictionary, reconnecting each to their parents. For any Parent IDs stored find the respective Node(s) in the Dictionary with matching ID(s) call the AddSuperNode() or AddAuxSuperNode() to re-connnect the node to its parent(s)

3) From any Node in the Dictionary find the root of the structure

4) Return the root Node

like image 431
Eric Anastas Avatar asked Apr 10 '09 02:04

Eric Anastas


People also ask

How do you serialize and deserialize a binary tree?

Problem Statement Serialize a binary tree to a file and then deserialize it back to a tree so that the original and the deserialized trees are identical. Serialize: write the tree in a file. Deserialize: read from a file and reconstruct the tree in memory. Consider the below tree as the input tree.

What is serialization and deserialization?

Serialization is a mechanism of converting the state of an object into a byte stream. Deserialization is the reverse process where the byte stream is used to recreate the actual Java object in memory. This mechanism is used to persist the object. The byte stream created is platform independent.

What does it mean to serialize a binary tree?

Serialize and Deserialize Binary Tree. Serialization is the process of converting a data structure or object into a sequence of bits so that it can be stored in a file or memory buffer, or transmitted across a network connection link to be reconstructed later in the same or another computer environment.

Which methods are used during serialization and deserialization process?

For serializing the object, we call the writeObject() method of ObjectOutputStream class, and for deserialization we call the readObject() method of ObjectInputStream class. We must have to implement the Serializable interface for serializing the object.


1 Answers

If a node has multiple parents, then it isn't a tree; it is, presumably, a graph. However - worry not; DataContractSerializer can handle this for you:

using System;
using System.IO;
using System.Runtime.Serialization;

[DataContract]
class Node {
    [DataMember]
    public Node AnotherNode { get; set; }
}

static class Program
{
    static void Main()
    {
        Node a = new Node(), b = new Node();
        // make it a cyclic graph, to prove reference-mode
        a.AnotherNode = b;
        b.AnotherNode = a;

        // the preserveObjectReferences argument is the interesting one here...
        DataContractSerializer dcs = new DataContractSerializer(
            typeof(Node), null, int.MaxValue, false, true, null);
        using (MemoryStream ms = new MemoryStream())
        {
            dcs.WriteObject(ms, a);
            ms.Position = 0;
            Node c = (Node) dcs.ReadObject(ms);
            // so .AnotherNode.Another node should be back to "c"
            Console.WriteLine(ReferenceEquals(c, c.AnotherNode.AnotherNode));
        }

    }
}
like image 172
Marc Gravell Avatar answered Sep 27 '22 22:09

Marc Gravell