Does anyone know of an R package that solves the longest common substring problem? I am looking for something fast that could work on vectors.
The longest common substrings of a set of strings can be found by building a generalized suffix tree for the strings and then finding the deepest internal nodes whose subtrees contain leaf nodes from every one of the strings.
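To make the problem concrete, here is a minimal sketch in Python. Note that it is not a suffix tree: it swaps in a simple brute-force search over the substrings of the shortest input, which is far slower on long strings but gives the same answer. The function name `longest_common_substrings` is made up for illustration and is not part of any package mentioned here.

```python
def longest_common_substrings(strings):
    """Return the sorted list of longest substrings shared by every string.

    Brute-force illustration (NOT a generalized suffix tree): any common
    substring must occur in the shortest input, so only its substrings
    need to be tested as candidates.
    """
    if not strings:
        return []
    shortest = min(strings, key=len)
    # Try candidate lengths from longest to shortest and stop at the
    # first length for which at least one candidate occurs everywhere.
    for length in range(len(shortest), 0, -1):
        candidates = {
            shortest[i:i + length]
            for i in range(len(shortest) - length + 1)
        }
        common = sorted(c for c in candidates if all(c in s for s in strings))
        if common:
            return common
    return []  # no non-empty substring is shared by all inputs


print(longest_common_substrings(["xabcdy", "zabcdw", "qqabcd"]))  # ['abcd']
```

A suffix tree finds the same result in time roughly linear in the total input length, which is why a package built on one is the better choice for large vectors.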
Check out the "Rlibstree" package from omegahat on GitHub.
It uses the libstree library: http://www.icir.org/christian/libstree/.