Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Database Structure for Tree Data Structure [closed]

You mention the most commonly implemented, which is Adjacency List: https://blogs.msdn.microsoft.com/mvpawardprogram/2012/06/25/hierarchies-convert-adjacency-list-to-nested-sets

There are other models as well, including materialized path and nested sets: http://communities.bmc.com/communities/docs/DOC-9902

Joe Celko has written a book on this subject, which is a good reference from a general SQL perspective (it is mentioned in the nested set article link above).

Also, Itzik Ben-Gann has a good overview of the most common options in his book "Inside Microsoft SQL Server 2005: T-SQL Querying".

The main things to consider when choosing a model are:

1) Frequency of structure change - how frequently does the actual structure of the tree change. Some models provide better structure update characteristics. It is important to separate structure changes from other data changes however. For example, you may want to model a company's organizational chart. Some people will model this as an adjacency list, using the employee ID to link an employee to their supervisor. This is usually a sub-optimal approach. An approach that often works better is to model the org structure separate from employees themselves, and maintain the employee as an attribute of the structure. This way, when an employee leaves the company, the organizational structure itself does not need to be changes, just the association with the employee that left.

2) Is the tree write-heavy or read-heavy - some structures work very well when reading the structure, but incur additional overhead when writing to the structure.

3) What types of information do you need to obtain from the structure - some structures excel at providing certain kinds of information about the structure. Examples include finding a node and all its children, finding a node and all its parents, finding the count of child nodes meeting certain conditions, etc. You need to know what information will be needed from the structure to determine the structure that will best fit your needs.


Have a look at Managing Hierarchical Data in MySQL. It discusses two approaches for storing and managing hierarchical (tree-like) data in a relational database.

The first approach is the adjacency list model, which is what you essentially describe: having a foreign key that refers to the table itself. While this approach is simple, it can be very inefficient for certain queries, like building the whole tree.

The second approach discussed in the article is the nested set model. This approach is far more efficient and flexible. Refer to the article for detailed explanation and example queries.


If you have to use Relational DataBase to organize tree data structure then Postgresql has cool ltree module that provides data type for representing labels of data stored in a hierarchical tree-like structure. You can get the idea from there.(For more information see: http://www.postgresql.org/docs/9.0/static/ltree.html)

In common LDAP is used to organize records in hierarchical structure.


Having a table with a foreign key to itself does make sense to me.

You can then use a common table expression in SQL or the connect by prior statement in Oracle to build your tree.