Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Tries versus ternary search trees for autocomplete?

I have gone through tries and Ternary Search Trees, and I have some questions on them. I have Googled for the answers but Im not able to get a concrete answer for those. So, here are my questions.

  1. If tries are space inefficient and TSTs combine the best of BST and tries, does that mean tries are practically not used at all?

  2. Assuming that TSTs are used for autocompletion,..how would that work in the case of Google? I mean practically we dont have a fixed set of words etc,..so how would the tree for the TST get constructed?

like image 945
Pavan Dittakavi Avatar asked Jun 10 '12 16:06

Pavan Dittakavi


1 Answers

Tries and ternary search trees represent a time/space trade off. If your alphabet has k symbols in it, then each node in a trie holds k pointers plus one extra bit for whether or not the node encodes a word. Looking up a word of length L always takes time O(L). A ternary search tree stores three pointers per node, plus one character and one bit for whether or not the node encodes a word. Looking up a word of length L takes time O(L log k). (If you have a static ternary search tree, you can build the TST using weight-balanced trees, which improves the lookup time to O(L + log k) but makes insertions prohibitively expensive.)

For cases where each node in the trie has most of its children used, the Trie is substantially more space efficient and time efficient than th ternary search tree. If each node stores comparatively few child nodes, the ternary search tree is much more space efficient. Typically speaking, tries are much, much faster than ternary search trees because fewer pointer indirections are required.

So in sort, neither structure is strictly better than the other. It depends on what words are being stored.

To mix things up a bit, succinct tries are starting to be a viable alternative to both of the above approaches. They have space usage better than tries, though the lookup time tends to be much slower. Again, it depends on the application whether they will be better or worse than the other two options.

As for how to build them - both tries and ternary search trees support efficient insertion of a single word. They don't need to be built from a fixed set of words in advance.

Hope this helps!

like image 124
templatetypedef Avatar answered Oct 05 '22 07:10

templatetypedef