Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the fastest (built-in) comparison for string-types in C#

What is the fastest built-in comparison-method for string-types in C#? I don't mind about the typographical/semantical meaning: the aim is to use the comparator in sorted lists in order to search fast in large collections. I think there are only two methods: Compare and CompareOrdinal. What's the fastest?

Additionally, is there is a faster method for those string-comparisons?

like image 565
Willem Van Onsem Avatar asked Sep 20 '09 21:09

Willem Van Onsem


People also ask

How do I compare strings in C?

We compare the strings by using the strcmp() function, i.e., strcmp(str1,str2). This function will compare both the strings str1 and str2. If the function returns 0 value means that both the strings are same, otherwise the strings are not equal.

Is integer comparison faster than string comparison?

Usually integers are faster. If the strings are really short (one character), they might be faster. This also depends on the processor you're using, and whether the data are in registers or memory (or what kind of memory, how it is accessed)… but it's safe to say that in most cases integers are faster.

What is the time complexity of comparing two strings?

String comparisons typically do a linear scan of the characters, returning false at the first index where characters do not match. The time complexity is O(N) and the actual time taken depends on how many characters need to be scanned before differences statistically emerge.


5 Answers

I'm assuming you want a less-than/equal/greater-than comparison rather than just equality; equality is a slightly different topic, although the principles are basically the same. If you're actually only searching for presence in something like a SortedList, I'd consider using a Dictionary<string, XXX> instead - do you really need all that sorting?

String.CompareOrdinal, or using an overload of String.Compare which allows the comparison to be provided, and specifying an ordinal (case-sensitive) comparison, e.g. String.Compare(x, y, StringComparison.Ordinal) will be the fastest.

Basically an ordinal comparison just needs to walk the two strings, character by character, until it finds a difference. If it doesn't find any differences, and the lengths are the same, the result is 0. If it doesn't find any differences but the lengths aren't the same, the longer string is deemed "larger". If it does find a difference, it can immediately work out which is deemed "larger" based on which character is "larger" in ordinal terms.

To put is another way: it's like doing the obvious comparison between two char[] values.

Culture-sensitive comparisons have to perform all kinds of tortuous feats, depending on the precise culture you use. For an example of this, see this question. It's pretty clear that having more complex rules to follow can make this slower.

like image 81
Jon Skeet Avatar answered Oct 01 '22 17:10

Jon Skeet


I just noticed a 50% performance increase in my own code by comparing string lengths first and if equal then using the string.compare methods. So in a loop I have:

VB:

If strA.length = strB.length then
   if string.compare(strA,strB,true) = 0 then
      TheyAreEqual
   End if
End if

C#:

if(strA.Length == strB.Length)
{
   if(string.Compare(strA,strB,true) == 0)
   {
       //they are equal
   }
}

This could be dependant on your own strings but its seems to have worked well for me.

like image 30
Peter Jamsmenson Avatar answered Sep 29 '22 17:09

Peter Jamsmenson


Fastest is interned strings with reference equality test, but you only get equality testing and it's at the heavy expense of memory - so expensive that it's almost never the recommended course.

Past that, a case-sensitive ordinal test will be the fastest, and this method is absolutely recommended for non-culture-specific strings. Case-sensitive is faster if it works for your use case.

When you specify either StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase, the string comparison will be non-linguistic. That is, the features that are specific to the natural language are ignored when making comparison decisions. This means the decisions are based on simple byte comparisons and ignore casing or equivalence tables that are parameterized by culture. As a result, by explicitly setting the parameter to either the StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase, your code often gains speed, increases correctness, and becomes more reliable.

Source

like image 25
Sam Harwell Avatar answered Oct 01 '22 17:10

Sam Harwell


I designed a unit test to test string comparison speed using some of the methods mentioned in this post. This test was ran using .NET 4

In short, there isn't much much difference, and I had to go to 100,000,000 iterations to see a significant difference. Since it seems the characters are compared in turn until a difference is found, inevitably how similar the strings are plays a part.

These results actually seem to suggest that using str1.Equals(str2) is the fastest way to compare strings.

These are the results of the test, with the test class included:

######## SET 1 compared strings are the same: 0
#### Basic == compare: 413
#### Equals compare: 355
#### Equals(compare2, StringComparison.Ordinal) compare: 387
#### String.Compare(compare1, compare2, StringComparison.Ordinal) compare: 426
#### String.CompareOrdinal(compare1, compare2) compare: 412

######## SET 2 compared strings are NOT the same: 0
#### Basic == compare: 710
#### Equals compare: 733
#### Equals(compare2, StringComparison.Ordinal) compare: 840
#### String.Compare(compare1, compare2, StringComparison.Ordinal) compare: 987
#### String.CompareOrdinal(compare1, compare2) compare: 776

using System;
using System.Diagnostics;
using NUnit.Framework;

namespace Fwr.UnitTests
{
    [TestFixture]
    public class StringTests
    {
        [Test]
        public void Test_fast_string_compare()
        {
            int iterations = 100000000;
            bool result = false;
            var stopWatch = new Stopwatch();

            Debug.WriteLine("######## SET 1 compared strings are the same: " + stopWatch.ElapsedMilliseconds);

            string compare1 = "xxxxxxxxxxxxxxxxxx";
            string compare2 = "xxxxxxxxxxxxxxxxxx";

            // Test 1

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = compare1 == compare2;
            }

            stopWatch.Stop();

            Debug.WriteLine("#### Basic == compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 2

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = compare1.Equals(compare2);
            }

            stopWatch.Stop();

            Debug.WriteLine("#### Equals compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 3

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = compare1.Equals(compare2, StringComparison.Ordinal);
            }

            stopWatch.Stop();

            Debug.WriteLine("#### Equals(compare2, StringComparison.Ordinal) compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 4

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = String.Compare(compare1, compare2, StringComparison.Ordinal) != 0;
            }

            stopWatch.Stop();

            Debug.WriteLine("#### String.Compare(compare1, compare2, StringComparison.Ordinal) compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 5

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = String.CompareOrdinal(compare1, compare2) != 0;
            }

            stopWatch.Stop();

            Debug.WriteLine("#### String.CompareOrdinal(compare1, compare2) compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            Debug.WriteLine("######## SET 2 compared strings are NOT the same: " + stopWatch.ElapsedMilliseconds);

            compare1 = "ueoqwwnsdlkskjsowy";
            compare2 = "sakjdjsjahsdhsjdak";

            // Test 1

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = compare1 == compare2;
            }

            stopWatch.Stop();

            Debug.WriteLine("#### Basic == compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 2

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = compare1.Equals(compare2);
            }

            stopWatch.Stop();

            Debug.WriteLine("#### Equals compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 3

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = compare1.Equals(compare2, StringComparison.Ordinal);
            }

            stopWatch.Stop();

            Debug.WriteLine("#### Equals(compare2, StringComparison.Ordinal) compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 4

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = String.Compare(compare1, compare2, StringComparison.Ordinal) != 0;
            }

            stopWatch.Stop();

            Debug.WriteLine("#### String.Compare(compare1, compare2, StringComparison.Ordinal) compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 5

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = String.CompareOrdinal(compare1, compare2) != 0;
            }

            stopWatch.Stop();

            Debug.WriteLine("#### String.CompareOrdinal(compare1, compare2) compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();
        }
    }
}
like image 28
gb2d Avatar answered Sep 29 '22 17:09

gb2d


This is quite an old question, but since I found it others might as well.

In researching this topic a bit further, I came upon an interesting blog post that compares all methods for string comparison. Probably not highly scientific but still a good housenumber.

Thanks to this article I started using string.CompareOrdinal in a scenario where I had to find out if one string was in a list of 170.000 other strings and doing this 1600 times in a row. string.CompareOrdinal made it almost 50% faster compared to string.Equals

like image 27
buddybubble Avatar answered Sep 28 '22 17:09

buddybubble