Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Which is easier to implement: 2-3-4 Tree or Red-Black Tree?

The textbook I'm learning from (Lafore) presents Red-Black Trees first, and does not include any pseudo-code, although the associated algorithms as presented seems fairly complex, with many unique cases.

Next he presents 2-3-4 Trees which seem to me much simpler to understand, and I would guess, to implement. He includes some actual Java code which is very clear. He seems to imply that 2-3-4 is easier to implement, and I would agree based on what I've seen so far.

Wikipedia, however, says the opposite... I think it is incorrect perhaps:

http://en.wikipedia.org/wiki/2-3-4_tree

2-3-4 trees are an isometry of red-black trees, meaning that they are equivalent data structures. In other words, for every 2-3-4 tree, there exists at least one red-black tree with data elements in the same order. Moreover, insertion and deletion operations on 2-3-4 trees that cause node expansions, splits and merges are equivalent to the color-flipping and rotations in red-black trees. Introductions to red-black trees usually introduce 2-3-4 trees first, because they are conceptually simpler. 2-3-4 trees, however, can be difficult to implement in most programming languages because of the large number of special cases involved in operations on the tree. Red-black trees are simpler to implement, so tend to be used instead.

Specifically, the part about the "larger number of special cases" seems quite backward to me (I think it is the RB which have the large number of special cases, not the 2-3-4). But, I'm still learning (and still honestly have not quite fully wrapped my head around the Red-Black Trees), so I'd love to hear other opinions.

As a sidenote... while I do agree with most of what Lafore says, I think it's interesting he presented them in the opposite order compared to what Wikipedia says is common (2-3-4 before RB). I do think 2-3-4 first would make more sense since the RB is so much harder to conceptualize. Perhaps he chose that order because the RB was more closely related to BST's which he had just finished in the previous chapter.

like image 910
The111 Avatar asked Oct 10 '22 03:10

The111


1 Answers

the part about the "larger number of special cases" seems quite backward to me (I think it is the RB which have the large number of special cases, not the 2-3-4)

RB Trees can be implemented in a dozen or so lines, if you have pattern matching in your langugage:

data Color = R | B
data Tree a = E | T Color (Tree a) a (Tree a)

balance :: Color -> Tree a -> a -> Tree a -> Tree a
balance B (T R (T R a x b) y c          ) z d                               = T R (T B a x b) y (T B c z d)
balance B (T R a           x (T R b y c)) z d                               = T R (T B a x b) y (T B c z d)
balance B a                               x (T R (T R b y c) z d          ) = T R (T B a x b) y (T B c z d)
balance B a                               x (T R b           y (T R c z d)) = T R (T B a x b) y (T B c z d)
balance col a x b = T col a x b

insert :: Ord a => a -> Tree a -> Tree a
insert x s = T B a y b where
  ins E          =  T R E x E
  ins s@(T col a y b) 
    | x < y      =  balance col (ins a) y b
    | x > y      =  balance col a y (ins b)
    | otherwise  =  s
  T _ a y b = ins s

This famous definition from Okasaki's paper

like image 75
Don Stewart Avatar answered Oct 13 '22 09:10

Don Stewart