Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What prevents Van Emde Boas trees from being more popular in real-world applications?

We know balanced trees perform insertion, deletion, and search in O(log n)-time, examples include

  • Red-Black
  • AVL
  • Splay
  • B-tree (and its variants).

However, when keys are integers in some limited range, it is possible to use a Van Emde Boas tree to drop these operations down to O(log(log n))-time, i.e., exponentially better than AVL or RB trees. Well, this is actually the case of many real world applications.

I see lots of applications for this. One I'd like to cite is on databases, for which creating indexes basically involves choosing between a Hash or a B*-tree. If a Van Emde Boas tree was implemented, it would provide a halfway between these two options, theoretically improving many query optimization problems.

Why Van Emde Boas tree is not widely used as say Red-Black or B-tree since

  • it's not a novelty (it was invented in 1975)
  • easy to implement
  • way faster than other trees

and what are the considerations about it?

like image 304
Cássio Jandir Pagnoncelli Avatar asked Jan 02 '14 09:01

Cássio Jandir Pagnoncelli


2 Answers

The asymptotic complexity is sometimes misleading. In the case for Van Emde Boas tree the constant is quite large see here. I quote:

However, for small trees the overhead associated with vEB trees
is enormous: on the order of 2^(m/2)

There are also other cases where an algorithm with better complexity exists but it only gets better for an input so big that in practice it is almost never used e.g. linear time static RMQ.

like image 50
Ivaylo Strandjev Avatar answered Oct 24 '22 10:10

Ivaylo Strandjev


One of the reason is that complexity is defined not on the size of the set you store but on the size of the universe of values. Another difference is that keys can't be arbitrary types for which you have comparison operation but must be integers. You should not see vEB as an alternative for BST but rather as an alternative for arrays. An array have O(1) store and look up costs for object keyed by integers. vEB offer O(log log M), where M is size of the universe of your values. Now, you see vEB is not better than regular array for look ups and store but it offers O(1) min, max operations and O(log log M) prev next key operations which array does not. It's worth to mention that the layout of vEB trees has a properties which make possible to create cache oblivious trees which are far more interesting developments of modern CS.

http://erikdemaine.org/papers/BRICS2002/paper.pdf

like image 34
Lambder Avatar answered Oct 24 '22 09:10

Lambder