Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Are There Any Good C++ Suffix Trie Libraries? [closed]

Tags:

Does anyone know of a really rock solid C++ library for suffix tries? Other than the one in Mummer?
Ideally, I'd like:
Some concept of concurrency.
Good caching behavior.
Permissive license.
Support for arbitrary alphabets.

like image 948
Jake Kurzer Avatar asked May 25 '11 10:05

Jake Kurzer


People also ask

Is suffix tree same as trie?

A suffix tree is a tree data structure typically used to store a list of strings. It is also referred to as the compressed version of a trie, as, unlike a trie, each unique suffix in the list is compressed together and represented by a single node or branch in a suffix tree.

How do you make a suffix tree?

We build a suffix tree by following each suffix and creating an edge for each character, starting with a top node. If the new suffix to be put in the tree begins with a set of characters that are already in the tree, we follow those characters until we have a different one, creating a new branch.

What is suffix trees and suffix arrays?

Suffix trees and suffix arrays are data structures for representing texts that allow substring queries like "where does this pattern appear in the text" or "how many times does this pattern occur in the text" to be answered quickly.

What is the time complexity of searching for a pattern of length M characters in a trie?

The time taken by this algorithm is O(n^2) where 'n' is the length of the text. This time is essentially taken to build the trie. Note that this is one time activity and subsequent searches of another pattern in this text would take O(m) time where m is the length of the pattern.


1 Answers

Being a bioinformatician, my pick would be SeqAn (check out the sequence index section). It implements a lazy suffix tree and an enhanced suffix array (an equivalent data structure), both of which have good cache behaviour.

like image 52
mhyfritz Avatar answered Sep 27 '22 19:09

mhyfritz