Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Database schema for hierarchical groups

I'm working on a database design for groups hierarchy used as the foundation of a larger system. Each group can contain other groups, and also 'devices' as leaf objects (nothing goes below device).

The database being used is MS SQL 2005. (Though working in MS SQL 2000 would be a bonus; a solution requiring MS SQL 2008 is unfortunately not feasible at this time).

There are different types of groups, and these need to be dynamic and definable at run-time by users. For example, group types might be "customer", "account", "city", or "building", "floor", and each type is going to have a different set of attributes, definable by the user. There will also be business rules applied - eg, a "floor" can only be contained underneath a "building" group, and again, these are definable at runtime.

A lot of the application functionality comes from running reports based on these groups, so there needs to be a relatively fast way to get a list of all devices contained within a certain group (and all sub-groups).

Storing groups using modified pre-order tree traversal technique has the upside that it is fast, but the downside that it is fairly complex and fragile - if external users/applications modify the database, there is the potential for complete breakage. We're also implementing an ORM layer, and this method seems to complicate using relations in most ORM libraries.

Using common table expressions and a "standard" id/parentid groups relation seem to be a powerful way to avoid running multiple recursive queries. Is there any downside to this method?

As far as attributes, what is the best way to store them? A long, narrow table that relates back to group? Should a common attribute, like "name" be stored in a groups table, instead of the attributes table (a lot of the time, the name will be all that is required to display)?

Are there going to be performance issues using this method (let's assume a high average of 2000 groups with average of 6 attributes each, and average 10 concurrent users, on a reasonable piece of hardware, eg, quad-core Xeon 2 Ghz, 4GB ram, discounting any other processes)?

Feel free to suggest a completely different schema than what I've outlined here. I was just trying to illustrate the issues I'm concerned about.

like image 236
gregmac Avatar asked Sep 22 '08 02:09

gregmac


People also ask

Which database is best for hierarchical data?

Document based database like MongoDB, and Redis are great for small scale, hierarchical data with a relatively small amount of children for each entry.

What is a hierarchical data model database?

A hierarchical database model is a data model in which the data are organized into a tree-like structure. The data are stored as records which are connected to one another through links. A record is a collection of fields, with each field containing only one value.

Can SQL be used with hierarchical database?

Use hierarchyid as a data type to create tables with a hierarchical structure, or to describe the hierarchical structure of data that is stored in another location. Use the hierarchyid functions in Transact-SQL to query and manage hierarchical data.


1 Answers

I'd recommend you actually construct the easiest-to-maintain way (the "standard" parent/child setup) and run at least some basic benchmarks on it.

You'd be surprised what a database engine can do with the proper indexing, especially if your dataset can fit into memory.

Assuming 6 attributes per group, 2000 groups, and 30 bytes/attribute, you're talking 360KB*expected items/group -- figure 400KB. If you expect to have 1000 items/group, you're only looking at 400MB of data -- that'll fit in memory without a problem, and databases are fast at joins when all the data is in memory.

like image 58
Jonathan Rupp Avatar answered Sep 23 '22 09:09

Jonathan Rupp