Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What are the main differences between the Knuth-Morris-Pratt and Boyer-Moore search algorithms?

What are the main differences between the Knuth-Morris-Pratt search algorithm and the Boyer-Moore search algorithm?

I know KMP searches for Y in X, trying to define a pattern in Y, and saves the pattern in a vector. I also know that BM works better for small words, like DNA (ACTG).

What are the main differences in how they work? Which one is faster? Which one is less computer-greedy? In which cases?

like image 818
ghaschel Avatar asked Sep 29 '12 20:09

ghaschel


People also ask

What is the purpose of Knuth Morris Pratt algorithm?

Knuth Morris Pratt (KMP) is an algorithm, which checks the characters from left to right. When a pattern has a sub-pattern appears more than one in the sub-pattern, it uses that property to improve the time complexity, also for in the worst case.

Which is the best string matching algorithm?

The Karp-Rabin Algorithm.

Why is the KMP algorithm better than the naive string matching algorithm?

The worst case complexity of the Naive algorithm is O(m(n-m+1)). The time complexity of KMP algorithm is O(n) in the worst case. KMP (Knuth Morris Pratt) Pattern Searching The Naive pattern searching algorithm doesn't work well in cases where we see many matching characters followed by a mismatching character.

Is Boyer Moore a pattern matching algorithm?

Unlike the previous pattern searching algorithms, the Boyer Moore algorithm starts matching from the last character of the pattern. In this post, we will discuss the bad character heuristic and the Good Suffix heuristic in the next post. Pattern P moves past the mismatched character.


1 Answers

In an rough explanation

Boyer-Moore's approach is to try to match the last character of the pattern instead of the first one with the assumption that if there's not match at the end no need to try to match at the beginning. This allows for "big jumps" therefore BM works better when the pattern and the text you are searching resemble "natural text" (i.e. English)

Knuth-Morris-Pratt searches for occurrences of a "word" W within a main "text string" S by employing the observation that when a mismatch occurs, the word itself embodies sufficient information to determine where the next match could begin, thus bypassing re-examination of previously matched characters. (Source: Wiki)

This means KMP is better suited for small sets like DNA (ACTG)

like image 68
gtgaxiola Avatar answered Oct 02 '22 17:10

gtgaxiola