Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

LINQ query is slow

During application profiling I found that function checking for pattern match is very slow. It is written using LINQ. Simple replacement of this LINQ expression with loop makes really huge difference. What is it? Is LINQ really such a bad thing and works so slow or I misunderstand something?

private static bool PatternMatch1(byte[] buffer, int position, string pattern)
{
    int i = 0;

    foreach (char c in pattern)
    {
        if (buffer[position + i++] != c)
        {
            return false;
        }
    }

    return true;
}    

version 2 with LINQ (suggested by Resharper)

private static bool PatternMatch2(byte[] buffer, int position, string pattern)
{
    int i = 0;
    return pattern.All(c => buffer[position + i++] == c);
}

version 3 with LINQ

private static bool PatternMatch3(byte[] buffer, int position, string pattern)
{
    return !pattern.Where((t, i) => buffer[position + i] != t).Any();
}

version 4 using lambda

private static bool PatternMatch4(byte[] buffer, int position, string pattern, Func<char, byte, bool> predicate)
{
    int i = 0;

    foreach (char c in pattern)
    {
        if (predicate(c, buffer[position + i++]))
        {
            return false;
        }
    }

    return true;
}

and here is the usage with large buffer

const int SIZE = 1024 * 1024 * 50;

byte[] buffer = new byte[SIZE];

for (int i = 0; i < SIZE - 3; ++i)
{
    if (PatternMatch1(buffer, i, "xxx"))
    {
        Console.WriteLine(i);
    }
}

Calling PatternMatch2 or PatternMatch3 is phenomenally slow. It takes about 8 seconds for PatternMatch3 and about 4 seconds for PatternMatch2, while calling PatternMatch1 takes about 0.6. As far as I understand it is the same code and I see no difference. Any ideas?

like image 213
axe Avatar asked Oct 10 '12 12:10

axe


People also ask

Why is LINQ slow?

It is slightly slower This is acknowledged in the Microsoft documentation: LINQ syntax is typically less efficient than a foreach loop. It's good to be aware of any performance tradeoff that might occur when you use LINQ to improve the readability of your code.

Is LINQ in C# actually slow?

Conclusion. It would seem the performance of LINQ is similar to more basic constructs in C#, except for that notable case where Count was significantly slower. If performance is important it's crucial to do benchmarks on your application rather than relying on anecdotes (including this one).

Is LINQ slower than SQL?

We can see right away that LINQ is a lot slower than raw SQL, but compiled LINQ is a bit faster. Note that results are in microseconds; real-world queries may take tens or even hundreds of milliseconds, so LINQ overhead will be hardly noticeable.


4 Answers

Mark Byers and Marco Mp are right about the extra calls overhead. However, there is another reason here: many object allocations, due to a closure.

The compiler creates a new object on each iteration, storing the current values of buffer, position and i that the predicate can use. Here is a dotTrace snapshot of PatternMatch2 showing this. <>c_DisplayClass1 is the compiler-generated class.

(Note that the numbers are huge because I'm using tracing profiling, which adds overhead. However, it's the same overhead per call, so the overall percentages are correct).

enter image description here

like image 114
Julien Lebosquain Avatar answered Oct 01 '22 18:10

Julien Lebosquain


Well let's take the Where operator.

It's implementation is almost(*) like:

public IEnumerable<T> Where(this IEnumerable<T> input, Func<T, bool> fn)
{
    foreach(T i in input) 
        if (fn(i))
            yield return i;
}

This means, for each loop over the IEnumerable an iterator object is created - note that you have SIZE - n of these allocations because you do those many LINQ queries.

Then for each character in the pattern you have:

  1. A function call to the enumerator
  2. A function call to the predicate

The second is a call to a delegate which costs roughly double the calling cost of a typical virtual call (where in the first version you have no extra calls apart the array deindexing.

If you really want brute performance, you probably want to get as "older-style" code as possible. In this case I would even replace that foreach in method 1 with a plain for (at the very least if it does not serve to optimize, it makes it more readable since you are keeping track of the index anyway).

It's also more readable in this case, and it shows that Resharper suggestions are sometimes debatable.

(*) I said almost because it uses a proxy method to check that the input enumerable is not null and throw the exception earlier than the time the collection is enumerated -- it's a small detail which does not invalidate what I wrote before, highlighting here for completeness.

like image 42
Marco Mp Avatar answered Oct 01 '22 17:10

Marco Mp


The difference is that the LINQ version has extra function calls. Internally in LINQ your lambda function is called in a loop.

While the extra call might be optimized away by the JIT compiler, it's not guaranteed and it can add a small overhead. In most cases the extra overhead won't matter, but because your function is very simple and is called an extremely large number of times, even a small overhead can quickly add up. In my experience LINQ code generally is a little bit slower than equivalent for loops. That's the performance price you often pay for a more high level syntax.

In this situation, I would recommend sticking with the explicit loop.

While you're optimizing this section of code, you might also want to consider a more efficient searching algorithm. Your algorithm is worst case O(n*m) but better algorithms exist.

like image 25
Mark Byers Avatar answered Oct 01 '22 17:10

Mark Byers


LINQ's main goal when applied to collections is its simplicity. If you want performance, you should avoid LINQ entirely. Also to enumerate an array you just increment an index variable, while LINQ needs to set up an entire enumerator object.

like image 23
Wormbo Avatar answered Oct 01 '22 16:10

Wormbo