Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

My alernative to nested sets for arbitrary-depth hierarchical data sets: Good or Bad?

While recreating my CMS, I wanted an alternative to the traditional parent/child approach for managing the sitemap / page hierarchy. I had remembered seeing the nested set model a while back, but couldn't remember what it was called. So, I stumbled upon a similar approach that I want to evaluate and compare the properties, making sure I won't run into dumb limitations later on because I didn't go with what is already time-tested. So, please advise if A) it's already been invented (what's it called?!), B) there are fundamental flaws in the properties, or C) it's a good approach (please give good justification!).

Consider this list:

  • Home
    • About Us
    • Contact Us
    • Products
      • Clothing
      • Books
      • Electronics
    • Knowledge Base
    • Other stuff

Under the nested set model, I believe you store the left/right descriptors for each node with a depth-first traversal:

Home                  1-18
    About Us          2-3
    Contact Us        4-5
    Products          6-13
        Clothing      7-8
        Books         9-10
        Electronics  11-12
    Knowledge Base   14-15
    Other stuff      16-17

And here's my "wrong way" that I'm starting to like better:

Home                  1-9
    About Us          2-2
    Contact Us        3-3
    Products          4-7
        Clothing      5-5
        Books         6-6
        Electronics   7-7
    Knowledge Base    8-8
    Other stuff       9-9

Rather than a left/right pair, I'm storing ID and LAST_CONTAINED_ID. I found that many of the properties are the same (or very similar):

  • The root node is ID = 1
  • For "leaves," both properties are equal, while with branches, they are not
  • The total number of "subnodes" for any given node is LAST_CONTAINED_ID - ID
  • All contained nodes have an ID > the container's ID, but <= the container's LAST_CONTAINED_ID
  • The ancestor nodes have an ID < the child ID, but also a LAST_CONTAINED_ID >= the child ID
  • The depth is the SUM of the ancestor nodes

In addition, the ID serves an order-specific, unique identifier (with no gaps!). I've found it easier also to store DEPTH and PARENT references for simplicity, but that's pretty much the same for nested sets too from what I understand.

So, does this count as a nested set? And is it already a common approach (but why hadn't I heard of it before...)? Is there a good reason why I should use a true nested set over this?

I welcome your thoughts.

like image 302
landons Avatar asked Nov 15 '11 12:11

landons


1 Answers

The only advantage it gives is the 'no gaps' feature, but to achieve that you've had to change the logic applied to right-values. In the original model, you get the children of 'Products' by seeing all those values 6 < .. < 13, but in your model, you get those children by seeing values 4 < .. <= 7. Having to treat right-values different to left-values makes it slightly less elegant.
Another minor gripe is that in the original, the jump from 12 to 14 highlights that you've changed level, whereas in your model you don't get such visual cues.
So if you're happy using (<, <=) in place of (<, <) then it works. (Since it appears to be equivalent, I can't say 'good' or 'bad', but you've already highlighted the dangers of implementing the path less travelled.)

like image 106
niczero Avatar answered Oct 10 '22 08:10

niczero