Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the most efficient (read time) string search method? (C#)

I find that my program is searching through lots of lengthy strings (20,000+) trying to find a particular unique phrase.

What is the most efficent method for doing this in C#?

Below is the current code which works like this:

  1. The search begins at startPos because the target area is somewhat removed from the start
  2. It loops through the string, at each step it checks if the substring from that point starts with the startMatchString, which is an indicator that the start of the target string has been found. (The length of the target string varys).
  3. From here it creates a new substring (chopping off the 11 characters that mark the start of the target string) and searches for the endMatchString

I already know that this is a horribly complex and possibly very inefficent algorithm. What is a better way to accomplish the same result?

string result = string.Empty;    
for (int i = startPos; i <= response.Length - 1; i++)
{
   if (response.Substring(i).StartsWith(startMatchString))
   {
       string result = response.Substring(i).Substring(11);
       for (int j = 0; j <= result.Length - 1; j++)
       {
           if (result.Substring(j).StartsWith(endMatchString))
           {
               return result.Remove(j)
           }
       }
   }
}
return result;
like image 872
Andrew Harry Avatar asked Nov 27 '22 19:11

Andrew Harry


1 Answers

You can use String.IndexOf, but make sure you use StringComparison.Ordinal or it may be one order of magnitude slower.

private string Search2(int startPos, string startMatchString, string endMatchString, string response) {
    int startMarch = response.IndexOf(startMatchString, startPos, StringComparison.Ordinal);
    if (startMarch != -1) {
        startMarch += startMatchString.Length;
        int endMatch = response.IndexOf(endMatchString, startMarch, StringComparison.Ordinal);
        if (endMatch != -1) { return response.Substring(startMarch, endMatch - startMarch); }
    }
    return string.Empty;
}

Searching 1000 times a string at about the 40% of a 183 KB file took about 270 milliseconds. Without StringComparison.Ordinal it took about 2000 milliseconds.
Searching 1 time with your method took over 60 seconds as it creates a new string (O(n)) each iteration, making your method O(n^2).

like image 119
ggf31416 Avatar answered Dec 10 '22 07:12

ggf31416