Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why dont we use 2-3 or 2-3-4-5 trees?

I have a basic understanding of how 2-3-4 trees maintain the height balance property operation after operation to make sure even the worst case operations are O(n logn).

But I do not understand it well enough to know why only 2-3-4?

Why not 2-3 or 2-3-4-5 etc?

like image 846
Lazer Avatar asked Jan 06 '12 22:01

Lazer


1 Answers

Implementation of 2-3-4 trees typically requires either multiple classes (2NODE, 3NODE, 4NODE) or you have just NODE that has an array of items. In the case of multiple classes you waste lots of time constructing and destructing node instances and reparenting them is cumbersome. If you use a single class with arrays to hold items and children then you are either resizing arrays constantly which is similarly wasteful or you wind up wasting over half your memory on unused array elements. It's just not very efficient compared to Red-Black trees.

Red-Black trees have only one type of node structure. Since Red-Black trees have a duality with 2-3-4 trees, RB trees can use the exact same algorithms as 2-3-4 trees (no need for the stupidly confusing/complex implementations described in Cormen, Leiserson and Rivest that led to AA trees which are not less complex than the 2-3-4 algorithm.)

So, Red-Black trees for their ease of implementation plus their memory/CPU efficiency. (AVL trees are nice too. They produce more well balanced trees and are stupid simply to code but they tend to be less efficient due to working too often to maintain only a slightly more compact tree.)

Oh, and 2-3-4-5-6... etc aren't done because nothing is gained. 2-3-4 has a net-gain over 2-3 trees because they can be done without recursion easily (recursion tends to be less efficient, especially when it cannot be coded tail-recursively). However, B-Trees and Bplus-Trees are pretty much 2-3-4-5-6-7-8-9-etc trees where the max size of the nodes, n, is chosen so that n records can be stored in a single disk sector. (i.e. each disk sector is a node in the tree and the size of the sector is equivalent to the number of items stored in the node.) This is because the time to search through 512 records linearly in memory is still MUCH faster than traversing down a level in the tree which requires another disk seek/fetch. and O(512) is still O(1) and thus maintains O(lg n) for the tree.

like image 180
Jeff Wiegley Avatar answered Sep 29 '22 18:09

Jeff Wiegley