How does Trie and B+ tree compare for indexing lexicographically sorted strings [on the order some billions]? It should support range queries as well.
From perf. as well as implementation complexity point of view.
A tree is a general structure of recursive nodes. There are many types of trees. Popular ones are binary tree and balanced tree. A Trie is a kind of tree, known by many names including prefix tree, digital search tree, and retrieval tree (hence the name 'trie').
Unlike this, the B+ tree can store more keys than the B-tree of the same level because of their feature of storing pointers only at the leaf nodes. This contributes to storing more entries at fewer levels in the B+ tree and lessens the searching time for a key.
AVL tree is a binary tree while B-tree is a multi-way tree (N-ary tree) i.e. Any node in AVL tree can have at max two child nodes and one piece of information/data while any node in a B-tree can have n nodes and n-1 piece of information/data. For B-tree, n is also known as its order.
B-Tree : B-Tree is known as a self-balancing tree as its nodes are sorted in the inorder traversal. Unlike the binary trees, in B-tree, a node can have more than two children. B-tree has a height of logM N (Where 'M' is the order of tree and N is the number of nodes).
I would say it depends on what you mean by Range.
If your range is expressed as All words beginning by, then a Trie
is the right choice I'd say. On the other hand, Trie
are not meant for requests like All words between XX and ZZ.
Note that the branching factor of the B+ Tree
affects its performance (the number of intermediary nodes). If h
is the height of the tree, then nmax ~~ bh. Therefore h ~~ log(nmax) / log(b).
With n = 1 000 000 000
and b = 100
, we have h ~~ 5
. Therefore it means only 5 pointer dereferencing for going from the root to the leaf. It's more cache-friendly than a Trie
.
Finally, B+ Tree
is admittedly more difficult to implement than a Trie
: it's more on a Red-Black Tree
level of complexity.
Depends on your actual task:
N
children from a substree, then a Trie is the best choice because you simply visit less nodes than in a B+ Tree scenario. If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With