Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Suffix Arrays vs Suffix Trees

I just wanna know, when a suffix tree is superior to an enhanced suffix array.

After reading Replacing suffix trees with enhanced suffix arrays i don't see a reason to use suffix trees anymore. Some methods can get complicated, but you can do everything with a suffix array, what you can do with a suffix tree and you need the same time complexity but less memory.

A survey even showed, that suffix arrays are faster, because they are cache friendlier, and don't produce as much cache misses, then suffix trees (so the cache can predict the array usage much better, then on the recursive tree structure).

So, does anyone know a reason to choose a suffix tree over a suffix array?

edit Ok, if you know more tell me, so far its:

  • Suffixarrays dont allow on-line construction
  • Some pattern matching algorithms run faster on Suffixtrees
  • (added) because of the on-line construction, you can save it on hd a and enlarge an existent suffixtree. If you use a SSD, it should be quiet fast aswell.
like image 848
Nicolas Avatar asked Jun 25 '12 17:06

Nicolas


People also ask

What is suffix trees and suffix arrays?

Suffix arrays can be constructed by performing a depth-first traversal of a suffix tree. The suffix array corresponds to the leaf-labels given in the order in which these are visited during the traversal, if edges are visited in the lexicographical order of their first character.

What is the disadvantage of a suffix array search?

The disadvantage of suffix trees over suffix arrays is that they generally require more space to store all the internal pointers in the tree. If we are indexing a huge text (or collection of texts), this extra space may be too expensive.

What is meant by suffix tree?

In computer science, a suffix tree (also called PAT tree or, in an earlier form, position tree) is a compressed trie containing all the suffixes of the given text as their keys and positions in the text as their values. Suffix trees allow particularly fast implementations of many important string operations.

Is suffix array important?

Suffix array is an extremely useful data structure, it can be used for a wide range of problems. Following are some famous problems where Suffix array can be used.


1 Answers

There are some interesting thoughts on the subject in SO itself. You can also find more technical material available on line. There is another paper that might help you with your issues, claiming to be another efficient way to implement these structures.

I am not an expert in the issue, but it seems to me that suffix arrays may be somewhat slower, even though they are more space efficient. Nevertheless, I lack the practical experience to be more detailed about both of them.

like image 199
rlinden Avatar answered Sep 20 '22 11:09

rlinden