Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

KMP v.s. Suffix Tree for sub-string match

Tags:

algorithm

Wondering if anyone could provide some advice about pros and cons, between choosing KMP and suffix tree, if we want to see if a string is a sub-string of another string? Thanks.

thanks in advance, Lin

like image 524
Lin Ma Avatar asked Dec 03 '15 07:12

Lin Ma


People also ask

Which is the most efficient string matching algorithm?

The Knuth-Morris-Pratt Algorithm.

Is KMP matcher pattern matching algorithm?

The KMP matching algorithm uses degenerating property (pattern having same sub-patterns appearing more than once in the pattern) of the pattern and improves the worst case complexity to O(n).

What is suffix tree good for?

Suffix trees can be used to solve a large number of string problems that occur in text-editing, free-text search, computational biology and other application areas.

What is KMP string matching algorithm?

The KMP algorithm was the first-ever string matching algorithm that ran in linear time. Most of the naive string matching algorithms run in O(nm) time, while the KMP algorithm runs in O(m + n) time where n is the length of the string, and m is the length of the pattern.


1 Answers

Runtime and memory complexity is about the same. You prepare the pattern in O(N) and you can search in O(M) (n, m lenght of your strings).

Suffix trees can do a few more operations that may not be necessary for your application.

In KMP you prepare a search pattern and then you can look for it in may strings easily.

In Suffix trees you prepare the text to search then you can look for many patterns in it easily. Even though the memory usage is linear the constant is large so this will need more memory.

KMP is in general easier to code than suffix trees.

like image 175
Sorin Avatar answered Sep 19 '22 20:09

Sorin