What is the theory behind KMP pattern matching algorithm? [closed]

2 Answers

The KMP matching algorithm is based on finite automata and works by implicitly building the transition table for an automaton that matches the pattern. A very clever linear-time preprocessing pass over the pattern constructs the matching automaton, which is then simulated on the text being searched, also in linear time. The net result is a string-matching algorithm whose running time is linear in the combined length of the pattern and the text.
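To make the automaton view concrete, here is a minimal sketch that builds the transition table explicitly and then simulates it on the text. Note that KMP itself never materializes this table (it only encodes it implicitly), and the explicit construction below costs time proportional to the pattern length times the alphabet size rather than the pattern length alone. The names `build_dfa` and `dfa_search` are my own, chosen for illustration.

```python
def build_dfa(pattern):
    """Return delta, where delta[state][char] is the next state.

    State j means "the last j characters read equal pattern[:j]".
    Characters missing from a state's dict send the automaton to state 0.
    """
    m = len(pattern)
    alphabet = set(pattern)
    delta = [dict() for _ in range(m + 1)]
    if m == 0:
        return delta
    delta[0][pattern[0]] = 1
    x = 0  # state the automaton would be in after reading pattern[1:j]
    for j in range(1, m + 1):
        # On a mismatch, behave exactly as the restart state x would.
        for c in alphabet:
            delta[j][c] = delta[x].get(c, 0)
        if j < m:
            # On a match, advance to the next state.
            delta[j][pattern[j]] = j + 1
            x = delta[x].get(pattern[j], 0)
    return delta


def dfa_search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    m = len(pattern)
    if m == 0:
        return []
    delta = build_dfa(pattern)
    state = 0
    matches = []
    for i, c in enumerate(text):
        state = delta[state].get(c, 0)
        if state == m:  # accepting state: a full match ends at index i
            matches.append(i - m + 1)
    return matches
```

Each text character is consumed exactly once and costs one table lookup, which is where the linear-time simulation comes from.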

The automaton that's constructed has one state for each length of pattern prefix that has been matched so far. By default, the transitions are such that matching the next character advances to the next state in the machine, and reading a mismatched character transitions back to the beginning. However, certain pieces of the pattern might share some overlapping structure (a proper prefix of the pattern that also appears as a suffix of the part matched so far), so some extra transitions are added that take the automaton back to an earlier (but not the first) state when a mismatched character is read.
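Here is a minimal sketch of how those "fall back to an earlier state" transitions are usually encoded: a failure table stores, for each state, the earlier state to retry from on a mismatch. This is the common textbook formulation, not necessarily identical to the implementation mentioned below, and the names `failure_table` and `kmp_search` are my own.

```python
def failure_table(pattern):
    """fail[i] is the length of the longest proper prefix of
    pattern[:i + 1] that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        # Fall back to shorter overlaps until one can be extended.
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    if not pattern:
        return []
    fail = failure_table(pattern)
    matches = []
    k = 0  # current automaton state: number of pattern characters matched
    for i, c in enumerate(text):
        # Mismatch: jump back to an earlier state instead of restarting.
        while k > 0 and c != pattern[k]:
            k = fail[k - 1]
        if c == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]  # keep going so overlapping matches are found
    return matches
```

For the pattern `"abab"` the table is `[0, 0, 1, 2]`: after matching `"abab"` and then failing, the automaton resumes from state 2 because the suffix `"ab"` is also a prefix of the pattern.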

This algorithm is generalized by the Aho-Corasick algorithm, which builds a more complex automaton in order to search for multiple strings at once.

I have an implementation of this algorithm on my personal site that contains an extensive discussion of the details of how the algorithm works, including a correctness proof, a more formal intuition behind the algorithm, and an explanation of how to derive the algorithm from first principles. It took me a while to put together, but I hope that it might be a good resource to learn more about the algorithm.

Hope this helps!

150

answered Sep 28 '22 15:09

templatetypedef

Morris discovered the algorithm from first principles. Knuth, independently, learned about a theorem due to Stephen A. Cook that two-way deterministic pushdown automata can be simulated in linear time, and extracted an early version of "KMP" by specializing that simulation to a string-matching automaton. Pratt contributed an efficiency improvement. See Knuth's retelling.

answered Sep 28 '22 14:09

Per



Tags:

string

algorithm

pattern-matching

knuth-morris-pratt

user366312
