
String comparison in .NET Framework 4

Let me explain my problem (please excuse my English). I have a .NET executable in which every millisecond of processing matters.

This program does a lot of string comparison, mostly string1.IndexOf(string2, StringComparison.OrdinalIgnoreCase).

When I switched to Framework 4, the program took twice as long as before.

I looked for an explanation and found that IndexOf(s, OrdinalIgnoreCase) is much slower in Framework 4: in a simple console application, the loop took 30 ms on 3.5 and 210 ms on 4.0. Comparison with the current culture, however, is faster in Framework 4 than in 3.5.

Here is a sample of the code I use:

int iMax = 100000;
String str  = "Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+fr;+rv:1.9.0.1)+Gecko/2008070208+Firefox/3.0.1";
Stopwatch sw = new Stopwatch();
sw.Start();
StringComparison s = StringComparison.OrdinalIgnoreCase;
for(int i = 1;i<iMax;i++)
{
    str.IndexOf("windows", s);
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
Console.Read();
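
To reproduce the measurement across several comparison modes, the snippet above can be wrapped into a self-contained console program. This is only a minimal sketch: the class name IndexOfBenchmark, the helper Time, and the warm-up pass are illustrative assumptions, not part of the original program, and the numbers it prints will vary by machine and framework version.

```csharp
using System;
using System.Diagnostics;

public static class IndexOfBenchmark
{
    // Hypothetical helper: runs haystack.IndexOf(needle, comparison) `iterations`
    // times and returns the elapsed time in milliseconds.
    public static long Time(string haystack, string needle, StringComparison comparison, int iterations)
    {
        int sink = 0; // accumulate results so the calls are not trivially removable
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            sink += haystack.IndexOf(needle, comparison);
        }
        sw.Stop();
        GC.KeepAlive(sink);
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        const int iterations = 100000;
        string str = "Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+fr;+rv:1.9.0.1)+Gecko/2008070208+Firefox/3.0.1";
        StringComparison[] modes =
        {
            StringComparison.Ordinal,
            StringComparison.OrdinalIgnoreCase,
            StringComparison.CurrentCultureIgnoreCase
        };

        // Warm-up pass so JIT compilation is not counted in the timings.
        foreach (StringComparison mode in modes)
        {
            Time(str, "windows", mode, 1000);
        }

        foreach (StringComparison mode in modes)
        {
            Console.WriteLine("{0}: {1} ms", mode, Time(str, "windows", mode, iterations));
        }
    }
}
```

Building the same file against 3.5 and 4.0 and comparing the OrdinalIgnoreCase line is one way to check whether the slowdown shows up on a given machine.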

My questions are:

  1. Has anyone else noticed this problem?

  2. Does anyone have an explanation for this change?

  3. Is there a way to work around the problem?

Thanks.

baz asked Sep 22 '10 15:09




1 Answer

OK, I have an answer to one of my questions.

With Reflector I can see the difference between Framework 2 and 4, and it explains my performance issue.

public int IndexOf(string value, int startIndex, int count, StringComparison comparisonType)
{
    if (value == null)
    {
        throw new ArgumentNullException("value");
    }
    if ((startIndex < 0) || (startIndex > this.Length))
    {
        throw new ArgumentOutOfRangeException("startIndex", Environment.GetResourceString("ArgumentOutOfRange_Index"));
    }
    if ((count < 0) || (startIndex > (this.Length - count)))
    {
        throw new ArgumentOutOfRangeException("count", Environment.GetResourceString("ArgumentOutOfRange_Count"));
    }
    switch (comparisonType)
    {
        case StringComparison.CurrentCulture:
            return CultureInfo.CurrentCulture.CompareInfo.IndexOf(this, value, startIndex, count, CompareOptions.None);

        case StringComparison.CurrentCultureIgnoreCase:
            return CultureInfo.CurrentCulture.CompareInfo.IndexOf(this, value, startIndex, count, CompareOptions.IgnoreCase);

        case StringComparison.InvariantCulture:
            return CultureInfo.InvariantCulture.CompareInfo.IndexOf(this, value, startIndex, count, CompareOptions.None);

        case StringComparison.InvariantCultureIgnoreCase:
            return CultureInfo.InvariantCulture.CompareInfo.IndexOf(this, value, startIndex, count, CompareOptions.IgnoreCase);

        case StringComparison.Ordinal:
            return CultureInfo.InvariantCulture.CompareInfo.IndexOf(this, value, startIndex, count, CompareOptions.Ordinal);

        case StringComparison.OrdinalIgnoreCase:
            return TextInfo.IndexOfStringOrdinalIgnoreCase(this, value, startIndex, count);
    }
    throw new ArgumentException(Environment.GetResourceString("NotSupported_StringComparison"), "comparisonType");
}

This is the base code of IndexOf in both frameworks (there is no difference between 2 and 4).

But TextInfo.IndexOfStringOrdinalIgnoreCase differs:

Framework 2:

internal static unsafe int IndexOfStringOrdinalIgnoreCase(string source, string value, int startIndex, int count)
{
    if (source == null)
    {
        throw new ArgumentNullException("source");
    }
    return nativeIndexOfStringOrdinalIgnoreCase(InvariantNativeTextInfo, source, value, startIndex, count);
}

Framework 4:

internal static int IndexOfStringOrdinalIgnoreCase(string source, string value, int startIndex, int count)
{
    if ((source.Length == 0) && (value.Length == 0))
    {
        return 0;
    }
    int num = startIndex + count;
    int num2 = num - value.Length;
    while (startIndex <= num2)
    {
        if (CompareOrdinalIgnoreCaseEx(source, startIndex, value, 0, value.Length, value.Length) == 0)
        {
            return startIndex;
        }
        startIndex++;
    }
    return -1;
}

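The main algorithm has changed: in Framework 2 the call goes to native code (nativeIndexOfStringOrdinalIgnoreCase), which has been removed in Framework 4, where the search is a managed loop that calls CompareOrdinalIgnoreCaseEx at every start position. Good to know.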
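
To make the shape of the Framework 4 loop concrete, here is a rough sketch of the same idea in user code: try every start position and compare value.Length characters, ignoring case. The class NaiveIgnoreCaseSearch and the helper Fold are hypothetical names, and folding each char with char.ToUpperInvariant is my assumption about what "ordinal ignore case" amounts to here; it is not the framework's actual CompareOrdinalIgnoreCaseEx. Whether a hand-written loop like this is any faster than the built-in call would have to be measured.

```csharp
using System;

public static class NaiveIgnoreCaseSearch
{
    // Hypothetical helper, not framework code: upper-cases a single char
    // using the invariant culture so the comparison ignores case.
    private static char Fold(char c)
    {
        return char.ToUpperInvariant(c);
    }

    // Same shape as the decompiled Framework 4 loop: for each start position,
    // compare value.Length characters and return the first position that matches.
    public static int IndexOf(string source, string value)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (value == null) throw new ArgumentNullException("value");
        if (value.Length == 0) return 0;

        int last = source.Length - value.Length; // last start index where value still fits
        for (int start = 0; start <= last; start++)
        {
            int j = 0;
            while (j < value.Length && Fold(source[start + j]) == Fold(value[j]))
            {
                j++;
            }
            if (j == value.Length)
            {
                return start;
            }
        }
        return -1;
    }
}
```

In the worst case this does on the order of source.Length * value.Length character comparisons, plus one comparison-routine call per start position in the framework version.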

baz answered Sep 22 '22 09:09
