When would you use KMP over BOYER-MOORE

Tags:

I am currently learning about pattern matching algorithms and have come across these two algorithms. I have the following general ideas:

KMP

Compares text left-to-right
Uses a failure array to shift intelligently
takes O(m), where m is the length of the pattern, to compute failure array
takes O(m), space
takes O(n), time to search a string

Compares pattern from last character
Uses bad character jumps and good suffix jumps
takes O(m + size of alphabet) to compute tables
takes O(m + size of alphabet), space
takes O(n), but usually better to search

I came across the following question which triggered this question(True or False):

The Knuth-Morris-Pratt (KMP) algorithm is a good choice if we want to search for the same pattern repeatedly in many different texts.

So I believe the answer is true just because the assumption is that every time you run the algorithm on different text the preprocessing is only O(n) where for BM it is O(n + size of alphabet). However, I am not sure if I am making the correct assumption that every time the algorithm is rerun a new table is recomputed. Because say the text always falls in the alphabet of english. I would only need to compute the table once and just reuse the table. So at the end of the day, would the answer to this question be dependent on the fact that the algorithms are all being run on text which is contained in the same alphabet or is there some other factor which may affect it?

436

asked Apr 18 '13 14:04

Eric

1 Answers

In theory, both algorithms will have "similar" performance; KMP will do about 2n comparisons in the searching phase and Boyer-Moore will do about 3n comparisons in the searching phase in the worst case. In neither case do you need to repeat the preprocessing when you get a new text.

But the real answer is that you shouldn't use either one in practice.

The linear auxiliary storage needed by both algorithms leads to considerably...rougher performance on modern architectures because of all of the extra memory accesses.

However, the ideas behind Boyer-Moore and KMP underpin most fast string matching algorithms. Something like KMP's "failure function" idea is used by every practically effective string matching algorithm I know of; it turns out that you can compute a suboptimal "failure function" for a pattern on-the-fly that still gives you linear time matching while only needing constant additional space. Boyer-Moore is faster than linear in the "average case" of matching a fixed pattern against random noise, and this bears itself out in many practical situations.

answered Oct 05 '22 21:10

tmyklebu

Related questions
                            
                                In Order Successor in Binary Search Tree
                            
                                How to detect how similar a speech recording is to another speech recording?
                            
                                Find a number where it appears exactly N/2 times
                            
                                How to sort a stack using only stack operations?
                            
                                Join a string using delimiters
                            
                                How to check if a point lies on a line between 2 other points
                            
                                Is there anything like "std::and" or "std::or"?
                            
                                Missing integer variation - O(n) solution needed [closed]
                            
                                Find the most common entry in an array
                            
                                finding if two words are anagrams of each other
                            
                                What is a good example of recursion other than generating a Fibonacci sequence?
                            
                                What is the optimal blind algorithm for the game, 2048? [closed]
                            
                                Recommended Open Source C# algorithms & data structures libraries [closed]
                            
                                Remove Kinect depth shadow
                            
                                Find the longest word given a collection
                            
                                Understanding Ukkonen's algorithm for suffix trees [duplicate]
                            
                                Tickmark algorithm for a graph axis
                            
                                Use of indexes for multi-word queries in full-text search (e.g. web search)
                            
                                Best Fit Scheduling Algorithm
                            
                                Android: high quality image resizing / scaling

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

When would you use KMP over BOYER-MOORE

Tags:

algorithm

pattern-matching

knuth-morris-pratt

boyer-moore

Eric

People also ask

1 Answers

tmyklebu

Recent Activity

Donate For Us