Theoretically, what data structure can I use for trees with shared memory?

Real world problem

I have a forest of trees. Like 20,000 trees. This forest occupies too much memory. But these trees are similar - you could find groups of trees (for ~200 trees) so that they have a common subtree of quite a significant size (tens of %).

Theory

So knowing that:

Trees are similar i.e. they share a common connected subgraph including the root (not necessarily including the leaves - but possibly).

Does there exist any data structure that allows for efficient storing of that information? Once the structure is created, I'm only interested in reading.

It doesn't necessarily be a solution tight to .NET, I could code it from scratch, I just need the idea :D But of course, if there is some little-known structure in .NET that kind of achieves that, I would be pleased to know.

I have a feeling that this shared memory stuff may have something to do with immutable structures that by definition are expected to share memory...

My trees are not binary search trees, unfortunately. They can have any amount of children.

Reading

As for reading, it is quite simple. I am always navigating from the root to a leaf. As you would in any JSON or XML, given an exact path to a value.

Nature of similarity

The connected subgraph including the root that is same (potentially) among two trees always contains the root and spans down. In some cases it is possible to even reach the leaves. See an example (the yellow part is the connected subgraph including the root):

enter image description here

Given these rules, mathematically speaking all the trees are similar - the connected subgraph is either empty, or it contains only the root, or inductively - it contains the root and its children...

953

asked May 06 '16 06:05

Patryk Golebiowski

1 Answers

You can group children of your tree node by different "owners". When you add a node, you specify owner (or null to use default "shared" owner). When you traverse your tree, you also specify owner. Here is a sketch code:

class TreeNode {
    protected static readonly object SharedOwner = new object();
}

class TreeNode<T> : TreeNode {        
    private readonly T _data;
    private readonly Dictionary<object, List<TreeNode<T>>> _children;

    public TreeNode(T data) {
        this._data = data;
        _children = new Dictionary<object, List<TreeNode<T>>>();
    }

    public TreeNode<T> AddChild(T data, object owner = null) {
        if (owner == null)
            owner = SharedOwner;
        if (!_children.ContainsKey(owner))
            _children.Add(owner, new List<TreeNode<T>>());
        var added = new TreeNode<T>(data);
        _children[owner].Add(added);
        return added;
    }

    public void Traverse(Action<T> visitor, object owner = null) {
        TraverseRecursive(this, visitor, owner);
    }

    private void TraverseRecursive(TreeNode<T> node, Action<T> visitor, object owner = null) {
        visitor(node._data);
        // first traverse "shared" owner's nodes
        if (node._children.ContainsKey(SharedOwner)) {
            foreach (var sharedNode in node._children[SharedOwner]) {
                TraverseRecursive(sharedNode, visitor, owner);
            }
        }
        // then real owner's nodes
        if (owner != null && owner != SharedOwner && node._children.ContainsKey(owner)) {
            foreach (var localNode in node._children[owner]) {
                TraverseRecursive(localNode, visitor, owner);
            }
        }
    }
}

And a sample usage:

class Program {
    static void Main(string[] args) {
        // this is shared part
        var shared = new TreeNode<string>("1");
        var leaf1 = shared.AddChild("1.1").AddChild("1.1.1");
        var leaf2 = shared.AddChild("1.2").AddChild("1.2.1");
        var firstOwner = new object();
        var secondOwner = new object();
        // here we branch first time
        leaf1.AddChild("1.1.1.1", firstOwner);
        leaf2.AddChild("1.2.1.1", firstOwner);
        // and here another branch
        leaf1.AddChild("1.1.1.2", secondOwner);
        leaf2.AddChild("1.2.1.2", secondOwner);
        shared.Traverse(Console.WriteLine, firstOwner);
        shared.Traverse(Console.WriteLine, secondOwner);
        Console.ReadKey();
    }        
}

answered Oct 29 '22 09:10

Evk

Related questions
                            
                                Performance difference between .NET method (in .dll) and reflected version
                            
                                How to use a custom WCF proxy in a Powershell script with a config file?
                            
                                WPF: Custom tooltip arrow placement
                            
                                .NET generate generic method - compact framework
                            
                                SSL 256 bit encryption - on Azure website
                            
                                Programmatically determine if system has switchable graphics
                            
                                ServiceStack throws StackOverflowException when receiving large data
                            
                                Passing F# function to IEnumerable.Where vs IEnumerable.All
                            
                                How to make my custom TaskScheduler the default one
                            
                                Good approach for hundreds of comsumers and big files [closed]
                            
                                Why do I get different results using a virtual or non-virtual property?
                            
                                NuGet Dependency Framework Targeting not working when packaging using the .csproj
                            
                                WebApi Application: Error 404.0, Handler StaticFile
                            
                                Saving .NET user settings takes very long time
                            
                                Is it possible to Remove a VB.NET EventHandler that your class didn't Add?
                            
                                Should I create a new CacheItemPolicy for every item I add to a System.Runtime.Caching.ObjectCache?
                            
                                Referencing components in separate .JSX file
                            
                                Cannot connect to Microsoft band from Windows Runtime Component through Bluetooth
                            
                                What could be rate limiting CPU cycles on my C# WCF Service?
                            
                                VB.NET Iterator Function Loses Local Variables

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Theoretically, what data structure can I use for trees with shared memory?

Tags:

.net

algorithm

data-structures

tree