Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Contains is faster than StartsWith?

A consultant came by yesterday and somehow the topic of strings came up. He mentioned that he had noticed that for strings less than a certain length, Contains is actually faster than StartsWith. I had to see it with my own two eyes, so I wrote a little app and sure enough, Contains is faster!

How is this possible?

DateTime start = DateTime.MinValue; DateTime end = DateTime.MinValue; string str = "Hello there";  start = DateTime.Now; for (int i = 0; i < 10000000; i++) {     str.Contains("H"); } end = DateTime.Now; Console.WriteLine("{0}ms using Contains", end.Subtract(start).Milliseconds);  start = DateTime.Now; for (int i = 0; i < 10000000; i++) {     str.StartsWith("H"); } end = DateTime.Now; Console.WriteLine("{0}ms using StartsWith", end.Subtract(start).Milliseconds); 

Outputs:

726ms using Contains  865ms using StartsWith 

I've tried it with longer strings too!

like image 749
hackerhasid Avatar asked Jun 25 '10 17:06

hackerhasid


People also ask

Which is faster indexOf or contains?

NET 4.0 - IndexOf no longer uses Ordinal Comparison and so Contains can be faster.

Which is faster indexOf or contains Java?

Even if I switch the order of indexOf and contains, indexOf is still faster. In the benchmark i linked, the passed value is also alreday a string!

Is startsWith () case sensitive?

startsWith method is case sensitive. However there is a String. startsWithIgnoreCase method which as the name notes is case insensitive.

What is startsWith?

The startsWith() method returns true if a string starts with a specified string. Otherwise it returns false . The startsWith() method is case sensitive. See also the endsWith() method.


2 Answers

Try using StopWatch to measure the speed instead of DateTime checking.

Stopwatch vs. using System.DateTime.Now for timing events

I think the key is the following the important parts bolded:

Contains:

This method performs an ordinal (case-sensitive and culture-insensitive) comparison.

StartsWith:

This method performs a word (case-sensitive and culture-sensitive) comparison using the current culture.

I think the key is the ordinal comparison which amounts to:

An ordinal sort compares strings based on the numeric value of each Char object in the string. An ordinal comparison is automatically case-sensitive because the lowercase and uppercase versions of a character have different code points. However, if case is not important in your application, you can specify an ordinal comparison that ignores case. This is equivalent to converting the string to uppercase using the invariant culture and then performing an ordinal comparison on the result.

References:

http://msdn.microsoft.com/en-us/library/system.string.aspx

http://msdn.microsoft.com/en-us/library/dy85x1sa.aspx

http://msdn.microsoft.com/en-us/library/baketfxw.aspx

Using Reflector you can see the code for the two:

public bool Contains(string value) {     return (this.IndexOf(value, StringComparison.Ordinal) >= 0); }  public bool StartsWith(string value, bool ignoreCase, CultureInfo culture) {     if (value == null)     {         throw new ArgumentNullException("value");     }     if (this == value)     {         return true;     }     CultureInfo info = (culture == null) ? CultureInfo.CurrentCulture : culture;     return info.CompareInfo.IsPrefix(this, value,         ignoreCase ? CompareOptions.IgnoreCase : CompareOptions.None); } 
like image 118
Kelsey Avatar answered Oct 11 '22 17:10

Kelsey


I figured it out. It's because StartsWith is culture-sensitive, while Contains is not. That inherently means StartsWith has to do more work.

FWIW, here are my results on Mono with the below (corrected) benchmark:

1988.7906ms using Contains 10174.1019ms using StartsWith 

I'd be glad to see people's results on MS, but my main point is that correctly done (and assuming similar optimizations), I think StartsWith has to be slower:

using System; using System.Diagnostics;  public class ContainsStartsWith {     public static void Main()     {         string str = "Hello there";          Stopwatch s = new Stopwatch();         s.Start();         for (int i = 0; i < 10000000; i++)         {             str.Contains("H");         }         s.Stop();         Console.WriteLine("{0}ms using Contains", s.Elapsed.TotalMilliseconds);          s.Reset();         s.Start();         for (int i = 0; i < 10000000; i++)         {             str.StartsWith("H");         }         s.Stop();         Console.WriteLine("{0}ms using StartsWith", s.Elapsed.TotalMilliseconds);      } } 
like image 26
Matthew Flaschen Avatar answered Oct 11 '22 18:10

Matthew Flaschen