Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Whats faster in Oracle? Small table with tree structure vs. Huge flat table

I'm designing an application that will use Oracle, and we have this department hierarchy that we need to map in our database. Some thing that looks like this (I'm pretty sure you all know what I'm talking about but I'll include a piece of the ERD just in case):

alt text

So it will have data stored that looks like this:

[1 | 0]
[2 | 1]
[3 | 2]
[4 | 2]

In other words:

Department 1
     |__Department 2
             |___Department 3
             |___Department 4

And so on...

This will improve the number of records required on the table and the Data can be accessed using a CONNECT BY command, just having 1 recor per department. We usually go for this tree structure as solution, but in this new application performance is a critical, so I was wondering what if I have a flattened-out table that would look like this.

[1 | 0]
[2 | 1]
[3 | 1]
[3 | 2]
[4 | 1]
[4 | 2]

This allows you to have very obvious relationships without having to know the parent Department for a given child to know who their upper hierarchy Departments are. But this increases the amount of data required since you need a record for each level a Department is in, meaning that if a have a Department 15 levels below the top one we would require 15 records for it. The Department is pretty big so this may end up being a huge table (about 2 million records).

Ok, so after the brief introduction, this is the question; Has someone actually tried this that could tell me what is faster/less expensive for the DB between this two options, the huge flat table or the small tree one?

like image 892
Chepech Avatar asked Jan 21 '23 07:01

Chepech


2 Answers

I would definitely go for the first option (hierarchical approach). I think it's better to model the data correctly than to just use a bad data model to gain performance. Since you are modeling a hierarchy here, it makes sense to store it that way in the DB.

If you want the best of both worlds, my recommendation would be to look at using a materialized view to "flatten" the hierarchical data, then you are still storing the data properly, but you get the performance gains (if any) by using the materialized view.

There's almost always a way to follow a good data model and still find ways to get good performance. But a bad data model will cost you for years to come, and it takes great pain to correct it later.

However, even with the flattened approach, you have to consider that you are increasing the number of records dramatically, especially as you get to the leaf nodes in the tree, so I'd be surprised if having a flat hierarchy table (your second approach) would improve performance since there are many more records to process.

like image 185
dcp Avatar answered Feb 05 '23 17:02

dcp


Alternative for fast access to hierarchical data is Nested Set data model:

Nested Set on Wiki

It allows you to have single-pass access to all children nodes, regardless of the depth, however might require offline maintenance, depending on your impementation.

like image 42
dovka Avatar answered Feb 05 '23 17:02

dovka