I'm going through the data structures chapter in The Algorithm Design Manual and came across Suffix Trees.
The example states:
Input:
XYZXYZ$
YZXYZ$
ZXYZ$
XYZ$
YZ$
Z$
$
Output:
I'm not able to understand how that tree gets generated from the given input string. Suffix trees are used to find a given Substring in a given String, but how does the given tree help towards that? I do understand another given example of a trie shown below, but if the below trie gets compacted to a suffix tree, then what would it look like?
The standard efficient algorithms for constructing a suffix tree are definitely nontrivial. The main algorithm for doing so is called Ukkonen's algorithm and is a modification of the naive algorithm with two extra optmizations. You are probably best off reading this earlier question for details on how to build it.
You can construct suffix trees by using the standard insertion algorithms on radix tries to insert each suffix into the tree, but doing so wlil take time O(n2), which can be expensive for large strings.
As for doing fast substring searching, remember that a suffix tree is a compressed trie of all the suffixes of the original string (plus some special end-of-string marker). If a string S is a substring of the initial string T and you had a trie of all the suffixes of T, then you could just do a search to see if T is a prefix of any of the strings in that trie. If so, then T must be a substring of S, since all its characters exist in sequence somewhere in T. The suffix tree substring search algorithm is precisely this search applied to the compressed trie, where you follow the appropriate edges at each step.
Hope this helps!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With