Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is this the most efficient way to search for a substring?

Tags:

performance

c#

I'm working with some code that returns a code to indicate the type of user that they are (e.g. "A", "B", "C", "D", etc.). Each code corresponds to a specific role and/or scope (e.g. across the entire application or just for the object being worked on).

In some code that I'm looking at, I see calls to check if the user's code is one of several in order to allow them to perform some action. So I see calls like:

//"B" would come from the database
string userCode = "B";

//some more work...

//if the user's code is either A or C...
if("AC".IndexOf(userCode) >= 0) {
  //do work that allows the user to progress
} else {
  //notify user they can't do this operation
}

Is this an efficient way of performing this check? Are there more efficient ways?

Thanks in advance!

like image 666
David Hoerster Avatar asked Dec 21 '22 09:12

David Hoerster


2 Answers

Looking at the de-compiled code for Contains(), it just calls IndexOf() with StringComparison.Ordinal, so I'd say IndexOf() is most efficient (by a very small hair) i if used in the same way (Ordinal) since it has one less method call, but Contains() is more readable and therefore more maintainable...

public bool Contains(string value)
{
    return (this.IndexOf(value, StringComparison.Ordinal) >= 0);
}

As in all things, I'd go with what's more readable and maintainable then splitting hairs on performance. Only do micro-optimization when you know there's a bottleneck at this point.

UPDATE: Over 1,000,000 iterations:

  • Contains(value) - took 130ms
  • IndexOf(value, StringComparison.Ordinal) - took 128 ms

So as you can see, very, very NEAR same. Once again, go with what's more maintainable.

UPDATE 2: If your code is always a single char (not a 1-char string), IndexOf() is faster:

  • Contains(char value) - took 94 ms
  • IndexOf(char value) - took 16 ms

If you know your char codes are always a single char, it is about an order of magnitude faster to use IndexOf() with a char argument.

This is because Contains(char value) is an extension method off of IEnumerable<T> and not a first class method of string.

But once again ~100 ms over 1,000,000 iterations is really, truly, quite negligible.

like image 64
James Michael Hare Avatar answered Dec 24 '22 00:12

James Michael Hare


If you're looking for a single character, and it is not case-sensitive, use the overload that works with a char. Searching for a single case-insensitive char is quicker than a sub-string.

"AC".IndexOf('C');

This would have to be ridiculously performance critical to matter though. What you are doing would be extremely fast with any obvious method.

Update - Timings

[Test]
public void Time()
{
    const string UserCode = "C";
    const char UserCodeChar = 'C';
    const int Iterations = 10000000;

    double adjust = 0;

    Func<Action, double> time = action =>
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++) action();
        return sw.Elapsed.TotalMilliseconds;
    };

    Action<string, Action> test = (desc, t) =>
    {
        double ms = time(t) - adjust;
        Console.WriteLine(desc + " time: {0}ms", ms);
    };

    adjust = time(() => { });

    test("IndexOfString", () => "AC".IndexOf(UserCode));
    test("IndexOfString", () => "AC".IndexOf(UserCode));

    test("ContainsString", () => "AC".Contains(UserCode));
    test("ContainsString", () => "AC".Contains(UserCode));

    test("IndexOfChar", () => "AC".IndexOf(UserCodeChar));
    test("IndexOfChar", () => "AC".IndexOf(UserCodeChar));
}

Result:

IndexOfString time: 1035.2984ms
IndexOfString time: 1026.2889ms
ContainsString time: 764.9274ms
ContainsString time: 736.7621ms
IndexOfChar time: 92.9008ms
IndexOfChar time: 92.9961ms

like image 30
Tim Lloyd Avatar answered Dec 24 '22 01:12

Tim Lloyd