Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Suffix tree and Tries. What is the difference?

I am reading about Tries commonly known as Prefix trees and Suffix Trees.
Although I have found code for a Trie I can not find an example for a Suffix Tree. Also I get the feeling that the code that builds a Trie is the same as the one for a Suffix Tree with the only difference that in the former case we store prefixes but in the latter suffixes.
Is this true? Can anyone help me clear this out in my head? An example code would be great help!

like image 338
Cratylus Avatar asked Dec 15 '12 16:12

Cratylus


People also ask

Is suffix tree and trie same?

If you imagine a Trie in which you put some word's suffixes, you would be able to query it for the string's substrings very easily. This is the main idea behind suffix tree, it's basically a "suffix trie".

What is the difference between trees and tries?

A tree is a general structure of recursive nodes. There are many types of trees. Popular ones are binary tree and balanced tree. A Trie is a kind of tree, known by many names including prefix tree, digital search tree, and retrieval tree (hence the name 'trie').

What is trie and explain about suffix tree?

In computer science, a suffix tree (also called PAT tree or, in an earlier form, position tree) is a compressed trie containing all the suffixes of the given text as their keys and positions in the text as their values. Suffix trees allow particularly fast implementations of many important string operations.

What is suffix trie used for?

Trie. In a trie, each alphabet of all the strings in the prescribed string is parsed one by one and represented by a single node. If two or more words start with the same sub-string, the identical sub-string is represented by the same chain of nodes.


2 Answers

A suffix tree can be viewed as a data structure built on top of a trie where, instead of just adding the string itself into the trie, you would also add every possible suffix of that string. As an example, if you wanted to index the string banana in a suffix tree, you would build a trie with the following strings:

banana anana nana ana na a 

Once that's done you can search for any n-gram and see if it is present in your indexed string. In other words, the n-gram search is a prefix search of all possible suffixes of your string.

This is the simplest and slowest way to build a suffix tree. It turns out that there are many fancier variants on this data structure that improve on either or both space and build time. I'm not well versed enough in this domain to give an overview but you can start by looking into suffix arrays or this class advanced data structures (lecture 16 and 18).

This answer also does a wonderfull job explaining a variant of this data-structure.

like image 176
Ze Blob Avatar answered Sep 21 '22 23:09

Ze Blob


If you imagine a Trie in which you put some word's suffixes, you would be able to query it for the string's substrings very easily. This is the main idea behind suffix tree, it's basically a "suffix trie".

But using this naive approach, constructing this tree for a string of size n would be O(n^2) and take a lot of memory.

Since all the entries of this tree are suffixes of the same string, they share a lot of information, so there are optimized algorithms that allows you to create them more efficiently. Ukkonen's algorithm, for example, allows you to create a suffix tree online in O(n) time complexity.

like image 36
Juan Lopes Avatar answered Sep 20 '22 23:09

Juan Lopes