Boyer-Moore Practical in C#?

Boyer-Moore is probably the fastest non-indexed text-search algorithm known. So I'm implementing it in C# for my Black Belt Coder website.

I had it working and it showed roughly the expected performance improvements compared to String.IndexOf(). However, when I added the StringComparison.Ordinal argument to IndexOf, it started outperforming my Boyer-Moore implementation. Sometimes, by a considerable amount.
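For anyone who wants to reproduce the comparison, here is a minimal timing sketch. It is not the test code used for the numbers above; the class name `IndexOfTiming`, the `Time` helper, the test text, and the iteration count are all assumptions made for illustration. It times the culture-sensitive `IndexOf(string)` overload against the `StringComparison.Ordinal` overload, and any other search can be passed in as a delegate.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing harness; names and test data are illustrative only.
public static class IndexOfTiming
{
    // Runs the search repeatedly and returns the total elapsed milliseconds.
    public static long Time(Func<int> search, int iterations)
    {
        search(); // warm-up call so JIT compilation is not measured
        Stopwatch sw = Stopwatch.StartNew();
        for (int n = 0; n < iterations; n++)
            search();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        // Assumed test data: a long text with the pattern only at the very end.
        string text = new string('a', 1000000) + "needle";
        string pattern = "needle";

        long culture = Time(() => text.IndexOf(pattern), 20);
        long ordinal = Time(() => text.IndexOf(pattern, StringComparison.Ordinal), 20);

        Console.WriteLine("Culture-sensitive IndexOf: " + culture + " ms");
        Console.WriteLine("Ordinal IndexOf:           " + ordinal + " ms");
    }
}
```

A custom search can be timed the same way, for example `Time(() => searcher.Search(text), 20)`, where `searcher` is an instance of the class under test. Results will vary with the runtime version, the text, and the pattern length.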

I wonder if anyone can help me figure out why. I understand why StringComparison.Ordinal might speed things up, but how could it be faster than Boyer-Moore? Is it because of the overhead of the .NET platform itself, perhaps because array indexes must be validated to ensure they're in range, or something else altogether? Are some algorithms just not practical in C#.NET?

Below is the key code.

    // Base for search classes
    abstract class SearchBase
    {
        public const int InvalidIndex = -1;
        protected string _pattern;
        public SearchBase(string pattern) { _pattern = pattern; }
        public abstract int Search(string text, int startIndex);
        public int Search(string text) { return Search(text, 0); }
    }

    /// <summary>
    /// A simplified Boyer-Moore implementation.
    ///
    /// Note: Uses a single skip array, which uses more memory than needed and
    /// may not be large enough. Will be replaced with multi-stage table.
    /// </summary>
    class BoyerMoore2 : SearchBase
    {
        private byte[] _skipArray;

        public BoyerMoore2(string pattern)
            : base(pattern)
        {
            // TODO: To be replaced with multi-stage table
            _skipArray = new byte[0x10000];

            for (int i = 0; i < _skipArray.Length; i++)
                _skipArray[i] = (byte)_pattern.Length;
            for (int i = 0; i < _pattern.Length - 1; i++)
                _skipArray[_pattern[i]] = (byte)(_pattern.Length - i - 1);
        }

        public override int Search(string text, int startIndex)
        {
            int i = startIndex;

            // Loop while there's still room for search term
            while (i <= (text.Length - _pattern.Length))
            {
                // Look if we have a match at this position
                int j = _pattern.Length - 1;
                while (j >= 0 && _pattern[j] == text[i + j])
                    j--;

                if (j < 0)
                {
                    // Match found
                    return i;
                }

                // Advance to next comparison
                i += Math.Max(_skipArray[text[i + j]] - _pattern.Length + 1 + j, 1);
            }
            // No match found
            return InvalidIndex;
        }
    }
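The doc comment above mentions replacing the single 64K skip array with a multi-stage table. One possible shape for that is sketched below; it is an assumption about what such a table could look like, not the version the author eventually wrote. The class name `TwoStageSkipTable` is hypothetical. The high byte of a character selects a 256-entry page, the low byte selects the entry, and pages are allocated only for characters that occur in the pattern. It stores `int` values, so pattern lengths above 255 do not overflow as they would with `byte`.

```csharp
using System;

// Hypothetical two-stage skip table (illustrative sketch only).
public sealed class TwoStageSkipTable
{
    private readonly int _defaultSkip;
    private readonly int[][] _pages = new int[256][]; // pages start out null

    public TwoStageSkipTable(string pattern)
    {
        _defaultSkip = pattern.Length;

        // Same rule as the BoyerMoore2 constructor: the last character is excluded.
        for (int i = 0; i < pattern.Length - 1; i++)
        {
            char c = pattern[i];
            int[] page = _pages[c >> 8];
            if (page == null)
            {
                // First character seen in this page: fill it with the default skip.
                page = new int[256];
                for (int k = 0; k < page.Length; k++)
                    page[k] = _defaultSkip;
                _pages[c >> 8] = page;
            }
            page[c & 0xFF] = pattern.Length - i - 1;
        }
    }

    // Returns the skip distance for a character; characters on an
    // unallocated page fall back to the full pattern length.
    public int this[char c]
    {
        get
        {
            int[] page = _pages[c >> 8];
            return page == null ? _defaultSkip : page[c & 0xFF];
        }
    }

    public static void Main()
    {
        TwoStageSkipTable table = new TwoStageSkipTable("needle");
        Console.WriteLine(table['n']);      // character in the pattern
        Console.WriteLine(table['x']);      // same page, not in the pattern
        Console.WriteLine(table['\u4E2D']); // unallocated page
    }
}
```

For an ASCII-only pattern this allocates one 256-entry page instead of 65,536 entries, at the cost of one extra array lookup and a null check per skip. Whether that trade helps or hurts search speed would need to be measured.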

EDIT: I've posted all my test code and conclusions on the matter at http://www.blackbeltcoder.com/Articles/algorithms/fast-text-search-with-boyer-moore.

306

asked Feb 05 '11 02:02

Jonathan Wood

1 Answer

Based on my own tests and the comments made here, I've concluded that the reason String.IndexOf() performs so well with StringComparison.Ordinal is that the method calls into unmanaged code that likely employs hand-optimized assembly language.

I have run a number of different tests and String.IndexOf() just seems to be faster than anything I can implement using managed C# code.

If anyone's interested, I've written everything I've discovered about this and posted several variations of the Boyer-Moore algorithm in C# at http://www.blackbeltcoder.com/Articles/algorithms/fast-text-search-with-boyer-moore.

135

answered Sep 30 '22 00:09

Jonathan Wood

Tags: performance, c#, .net, algorithm, boyer-moore
