Wondering if anyone could provide some advice about pros and cons, between choosing KMP and suffix tree, if we want to see if a string is a sub-string of another string? Thanks.
thanks in advance, Lin
The Knuth-Morris-Pratt Algorithm.
The KMP matching algorithm uses degenerating property (pattern having same sub-patterns appearing more than once in the pattern) of the pattern and improves the worst case complexity to O(n).
Suffix trees can be used to solve a large number of string problems that occur in text-editing, free-text search, computational biology and other application areas.
The KMP algorithm was the first-ever string matching algorithm that ran in linear time. Most of the naive string matching algorithms run in O(nm) time, while the KMP algorithm runs in O(m + n) time where n is the length of the string, and m is the length of the pattern.
Runtime and memory complexity is about the same. You prepare the pattern in O(N) and you can search in O(M) (n, m lenght of your strings).
Suffix trees can do a few more operations that may not be necessary for your application.
In KMP you prepare a search pattern and then you can look for it in may strings easily.
In Suffix trees you prepare the text to search then you can look for many patterns in it easily. Even though the memory usage is linear the constant is large so this will need more memory.
KMP is in general easier to code than suffix trees.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With