
Graph database or relational database for tree structure data

I have company holding data with a hierarchical structure, recorded by year. For example, company A holds 50% of B, B holds 50% of C, and D holds 50% of C. Each firm has properties such as industry.

Writes are rare and most operations are reads. Specifically, starting from a set of root nodes, I extract the family tree by tracing downward through holdings above a certain percentage-share threshold. Several metrics of the family tree are of interest.

For each node:

  1. the depth from the root
  2. the product of the shares, layer by layer from the root, e.g. A holds 0.5 * 0.5 = 25% of C.

For each level:

  1. the distribution of share from each root
  2. the distribution of industry

Note that a node can have multiple roots, and we are interested in all of them.

For now, the data is stored in a relational database and the task described above is done through joins. Would a graph database such as Neo4j be more suitable for this data and this task? The crux of the problem is to have a proper index so that the joins do not have to be repeated for every query. Any suggestions and pointers would be greatly appreciated.

asked Mar 03 '26 05:03 by cccfran


1 Answer

Just about any graph database can model the information you are describing. How you go about constructing the queries to get what you want will be different in each product.

In InfiniteGraph we can model the information using the following schema:

UPDATE SCHEMA {
    CREATE CLASS Company {
        name        : String,
        industry    : String, 
        
        owns        : LIST {
                        element: Reference {
                            edgeClass       : Owns,
                            edgeAttribute   : owns
                        },
                        CollectionTypeName  : SegmentedArray
                    },
        ownedBy     : LIST {
                        element: Reference {
                            edgeClass       : Owns,
                            edgeAttribute   : ownedBy
                        },
                        CollectionTypeName  : SegmentedArray
                    }
        
    }
    
    CREATE CLASS Owns
    {
        percentage  : Real { Storage: B32 },
        owns        : Reference {referenced: Company, inverse: ownedBy },
        ownedBy     : Reference {referenced: Company,  inverse: owns }
    }
};

Then we can load the data you referred to in your question:

LET coA = CREATE Company { name: "A", industry: "Manufacturing" };
LET coB = CREATE Company { name: "B", industry: "Manufacturing" };
LET coC = CREATE Company { name: "C", industry: "Retail" };
LET coD = CREATE Company { name: "D", industry: "Construction" };

CREATE Owns { owns: $coB, ownedBy: $coA, percentage: 50.00 };
CREATE Owns { owns: $coC, ownedBy: $coB, percentage: 50.00 };
CREATE Owns { owns: $coC, ownedBy: $coD, percentage: 50.00 };

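Finally, we can define a weight calculator that combines the edge weights along a path. Here each edge's weight is 1/percentage; the weights are summed along the path, and the reciprocal of that sum is reported at the end. For the sample data this gives the value you're looking for: 1/(1/50 + 1/50) = 25. Note that the reciprocal of a sum of reciprocals equals the true product of the shares only in special cases such as this one (a single edge, or two edges whose percentages add up to 100). An exact product for arbitrary percentages would need an edge weight that adds correctly, for example the logarithm of the share, if the expression language allows it.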

CREATE WEIGHT CALCULATOR wcOwnership {
    minimum:    0,
    default:    0, 
    edges: {
        (:Company)-[ow:Owns]->(:Company): 1/ow.percentage
    }
};

The "edges" section defines the edge patterns to match and the computation that produces the weight for each matched edge. In InfiniteGraph, the edge weight does not have to be a stored attribute; it can be a simple attribute or the result of a complex computation based on the contents of one or many objects.
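As a sanity check on the weight arithmetic, here is a minimal Python sketch (not InfiniteGraph code). It compares three quantities for a path of ownership percentages: the true product of the shares, the reciprocal of the summed 1/percentage weights, and a log-based additive weight. It assumes the path weight is the plain sum of the edge weights, as the description above implies; the function names are hypothetical.

```python
import math

def product_share(percentages):
    """True indirect ownership: multiply the fractional shares along the path."""
    share = 1.0
    for p in percentages:
        share *= p / 100.0
    return share * 100.0  # convert back to a percentage

def reciprocal_sum_share(percentages):
    """Reciprocal of the summed 1/percentage edge weights."""
    return 1.0 / sum(1.0 / p for p in percentages)

def log_sum_share(percentages):
    """Additive weight -ln(p/100) per edge; exp(-sum) recovers the product."""
    total = sum(-math.log(p / 100.0) for p in percentages)
    return math.exp(-total) * 100.0

# Compare the three schemes on the sample path and on an uneven path.
for path in ([50.0, 50.0], [50.0, 20.0]):
    print(path,
          round(product_share(path), 4),
          round(reciprocal_sum_share(path), 4),
          round(log_sum_share(path), 4))
```

For two 50% edges all three agree at 25%. For a 50% edge followed by a 20% edge, the product is 10% while the reciprocal-sum scheme gives about 14.3%, so only the log-based weight tracks the product in general.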

With this data, we can use the weight calculator to query from the target company (C) up the hierarchy. For each root discovered, the query displays the target (C), the percentage of ownership, the length of the path, and the name of the root company. This particular query only follows paths of 1 to 10 edges ([*1..10]), but the range can be widened as necessary.

  DO> Match m = max weight 1000.0 wcOwnership 
                    ((cTarget:Company {name == 'C'})-[*1..10]->(cRoot:Company)) 
                     return cTarget.name, 
                            1/Weight(m) as PercentageOwnership, 
                            Length(m), 
                            cRoot.name;

{
  _Projection
  {
    cTarget.name:'C',
    PercentageOwnership:50.0000,
    Length(m):1,
    cRoot.name:'B'
  },
  _Projection
  {
    cTarget.name:'C',
    PercentageOwnership:50.0000,
    Length(m):1,
    cRoot.name:'D'
  },
  _Projection
  {
    cTarget.name:'C',
    PercentageOwnership:25.0000,
    Length(m):2,
    cRoot.name:'A'
  }
}  

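The query returns every owner reachable from the company in question, including intermediate owners such as B, so this model captures all of the root nodes for that company.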
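For comparison, the same result can be reproduced outside any database. The following is a minimal Python sketch, not InfiniteGraph code: it walks upward from a target company through an in-memory edge list, multiplies the shares along each path, records the path length, and then counts industries per level. The `OWNERSHIP` and `INDUSTRY` dictionaries and the function name `ancestors_with_share` are hypothetical, and cross-holdings (cycles) are bounded only by `max_depth`.

```python
from collections import Counter

# owner -> list of (owned company, fractional share); same sample data as above
OWNERSHIP = {
    "A": [("B", 0.5)],
    "B": [("C", 0.5)],
    "D": [("C", 0.5)],
}
INDUSTRY = {"A": "Manufacturing", "B": "Manufacturing",
            "C": "Retail", "D": "Construction"}

def ancestors_with_share(target, min_share=0.0, max_depth=10):
    """Return (owner, depth, cumulative share) for every upward path from target."""
    # Invert the edge list once: owned company -> list of (owner, share).
    owned_by = {}
    for owner, holdings in OWNERSHIP.items():
        for owned, share in holdings:
            owned_by.setdefault(owned, []).append((owner, share))

    results = []
    stack = [(target, 0, 1.0)]  # (company, depth so far, share so far)
    while stack:
        company, depth, share = stack.pop()
        if depth == max_depth:
            continue  # do not follow paths longer than max_depth edges
        for owner, edge_share in owned_by.get(company, []):
            cumulative = share * edge_share
            if cumulative < min_share:
                continue  # prune paths that fall below the threshold
            results.append((owner, depth + 1, cumulative))
            stack.append((owner, depth + 1, cumulative))
    # Sort by depth, then by owner name, for a stable listing.
    return sorted(results, key=lambda r: (r[1], r[0]))

rows = ancestors_with_share("C")
for owner, depth, share in rows:
    print(f"C <- {owner}: depth={depth}, share={share:.2%}")

# Industry distribution per level (one count per path reaching that level).
by_level = {}
for owner, depth, _ in rows:
    by_level.setdefault(depth, Counter())[INDUSTRY[owner]] += 1
print(by_level)
```

On the sample data this lists B and D at depth 1 with 50% each and A at depth 2 with 25%, matching the query output above. Passing `min_share=0.3` would drop the path to A, which is one way to apply the share threshold mentioned in the question.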

#InfiniteGraph

answered Mar 05 '26 23:03 by djhallx