Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Storing Composite Patterns (Hierarchical Data) in Database

What are 'best-practices' for saving Composite patterns in a Relational Database?

We have been using Modified Preorder Tree Traversal. This is very quick to build the whole tree, but very slow to insert or delete new nodes (all left and right values need to be adjusted). Also querying the children of a node is not easy and very slow.

Another thing we noticed is that you really have to make sure the tree doesn't get messy. You need transaction locks, otherwise the left and right values can get corrupt, and fixing a corrupt left right tree is not an easy job.

It does work very good however, the Modified Preorder Tree Traversal, but I was wondering if there are better alternatives.

like image 717
Lieven Cardoen Avatar asked Mar 29 '09 16:03

Lieven Cardoen


People also ask

Which database is best for hierarchical data?

Document based database like MongoDB, and Redis are great for small scale, hierarchical data with a relatively small amount of children for each entry.

What type of problems is the composite pattern a good fit for?

Composite design pattern is a structural pattern which modifies the structure of an object. This pattern is most suitable in cases where you need to work with objects which form a tree like hierarchy.

Which design pattern should you use when you want to represent part-whole hierarchies of objects?

Composite pattern is a partitioning design pattern and describes a group of objects that is treated the same way as a single instance of the same type of object. The intent of a composite is to “compose” objects into tree structures to represent part-whole hierarchies.


1 Answers

While finding all descendents of a row with MPTT is fast, finding all children can be slow. However you should be able to fix that by adding a parent_id field to your table that records (yes, redundantly) the parent of the row. Then the search becomes:

SELECT *
FROM tbl
WHERE parent_id = z

Yes, parent_id contains redundant information, potentially denormalizing your table -- but since any insert/update/delete already requires global changes, keeping parent_id up-to-date isn't much extra to pay. You could alternatively use a level field that records the vertical level of the row, although that is in fact more likely to change under certain types of transformations (e.g. moving a subtree to a different point in the tree).

The plain old link-to-parent representation (i.e. just having parent_id and no left_pos or right_pos), is of course faster for insert/update-heavy workloads, but the only queries it can answer efficiently are "Find the parent of X" and "Find the children of X." Most workloads involve much more reading than writing, so usually MPTT is faster overall -- but perhaps in your case you need to consider moving ("back") to link-to-parent?

like image 140
j_random_hacker Avatar answered Sep 29 '22 13:09

j_random_hacker