
LINQ Performance

Tags: linq

What exactly happens behind the scenes in a LINQ query against an object collection? Is it just syntactic sugar, or is something else going on that makes the query more efficient?

Crios asked Dec 04 '22 14:12


1 Answer

Do you mean in terms of a query expression, or what the query does behind the scenes?

Query expressions are expanded into "normal" C# first. For example:

var query = from x in source
            where x.Name == "Fred"
            select x.Age;

is translated to:

var query = source.Where(x => x.Name == "Fred")
                  .Select(x => x.Age);

The exact meaning of this depends on the type of source, of course. In LINQ to Objects, source typically implements IEnumerable<T>, so the Enumerable extension methods come into play, but it could be a different set of extension methods. (LINQ to SQL would use the Queryable extension methods, for example.)

Now, suppose we are using LINQ to Objects... after extension method expansion, the above code becomes:

var query = Enumerable.Select(Enumerable.Where(source, x => x.Name == "Fred"),
                              x => x.Age);
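To see that the three forms above really are the same query, here is a small sketch that runs all of them over the same data. The sample data and the anonymous element type are made up for illustration, and the code assumes C# 9 or later so it can use top-level statements.

```csharp
using System;
using System.Linq;

// Hypothetical sample data; the anonymous type stands in for whatever
// element type "source" has in the snippets above.
var source = new[]
{
    new { Name = "Fred", Age = 42 },
    new { Name = "Wilma", Age = 39 },
    new { Name = "Fred", Age = 7 },
};

// 1. Query expression.
var viaQuery = from x in source
               where x.Name == "Fred"
               select x.Age;

// 2. Extension method calls, as produced by query expression translation.
var viaMethods = source.Where(x => x.Name == "Fred")
                       .Select(x => x.Age);

// 3. Plain static method calls, as produced by extension method expansion.
var viaStatics = Enumerable.Select(
    Enumerable.Where(source, x => x.Name == "Fred"),
    x => x.Age);

bool allEqual = viaQuery.SequenceEqual(viaMethods)
             && viaMethods.SequenceEqual(viaStatics);

Console.WriteLine(string.Join(", ", viaQuery));   // 42, 7
Console.WriteLine(allEqual);                      // True
```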

Next the implementations of Select and Where become important. Leaving out error checking, they're something like this:

public static IEnumerable<T> Where<T>(this IEnumerable<T> source,
                                      Func<T, bool> predicate)
{
    foreach (T element in source)
    {
        if (predicate(element))
        {
            yield return element;
        }
    }
}

public static IEnumerable<TResult> Select<TSource, TResult>
    (this IEnumerable<TSource> source,
     Func<TSource, TResult> selector)
{
    foreach (TSource element in source)
    {
        yield return selector(element);
    }
}
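One consequence of writing Where and Select with yield return is that nothing runs until the result is iterated. The sketch below shows this with the real Enumerable methods; the list contents and the predicateCalls counter are only there for illustration, and top-level statements (C# 9 or later) are assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 1, 2, 3, 4 };
int predicateCalls = 0;   // illustrative counter, incremented by the predicate

// Building the query does not touch the list or call the lambdas.
var query = numbers.Where(n => { predicateCalls++; return n % 2 == 0; })
                   .Select(n => n * 10);

int callsBeforeIteration = predicateCalls;

// Added after the query was built, but before it is iterated,
// so the query still sees it.
numbers.Add(6);

// ToList() iterates the query, which is when the work actually happens.
List<int> results = query.ToList();

Console.WriteLine(callsBeforeIteration);          // 0
Console.WriteLine(string.Join(", ", results));    // 20, 40, 60
Console.WriteLine(predicateCalls);                // 5
```

Iterating query a second time would run the predicate over the whole list again, which is worth remembering when a query is expensive.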

Next there's the expansion of iterator blocks into state machines, which I won't go into here but which I have an article about.
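Without showing the generated code, the effect of that state machine can be observed by driving an iterator by hand. This is only a sketch of the visible behaviour, not of what the compiler emits; Numbers and log are hypothetical names, and top-level statements (C# 9 or later) are assumed.

```csharp
using System;
using System.Collections.Generic;

var log = new List<string>();

// A local iterator method. Its body is what gets rewritten into a state
// machine that can pause at each yield and resume later.
IEnumerable<int> Numbers()
{
    log.Add("started");
    yield return 1;
    log.Add("resumed");
    yield return 2;
    log.Add("finished");
}

IEnumerator<int> iterator = Numbers().GetEnumerator();
int entriesBeforeFirstMove = log.Count;   // calling Numbers() ran none of its body

bool moved1 = iterator.MoveNext();        // runs up to the first yield
int first = iterator.Current;

bool moved2 = iterator.MoveNext();        // resumes just after the first yield
int second = iterator.Current;

bool moved3 = iterator.MoveNext();        // runs to the end of the body
iterator.Dispose();

Console.WriteLine(entriesBeforeFirstMove);        // 0
Console.WriteLine($"{first}, {second}");          // 1, 2
Console.WriteLine($"{moved1} {moved2} {moved3}"); // True True False
Console.WriteLine(string.Join(", ", log));        // started, resumed, finished
```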

Finally, there's the conversion of lambda expressions into extra methods + appropriate delegate instance creation (or expression trees, depending on the signatures of the methods called).
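The same lambda expression can end up as either of those, depending purely on the type it is converted to. A minimal sketch, with an arbitrary example predicate and top-level statements (C# 9 or later) assumed:

```csharp
using System;
using System.Linq.Expressions;

// Converted to a delegate: the compiler emits a method and creates a
// delegate instance pointing at it. This is what Enumerable.Where takes.
Func<int, bool> asDelegate = x => x > 5;

// Converted to an expression tree: the compiler emits code that builds a
// data structure describing the lambda. This is what Queryable.Where takes,
// so that a provider such as LINQ to SQL can inspect and translate it.
Expression<Func<int, bool>> asTree = x => x > 5;

ExpressionType bodyKind = asTree.Body.NodeType;
string parameterName = asTree.Parameters[0].Name;

// An expression tree can still be turned into executable code on demand.
Func<int, bool> compiled = asTree.Compile();

Console.WriteLine(asDelegate(10));   // True
Console.WriteLine(bodyKind);         // GreaterThan
Console.WriteLine(parameterName);    // x
Console.WriteLine(compiled(3));      // False
```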

So basically LINQ uses a lot of clever features of C#:

  • Lambda expression conversions (into delegate instances and expression trees)
  • Extension methods
  • Type inference for generic methods
  • Iterator blocks
  • Often anonymous types (for use in projections)
  • Often implicit typing for local variables
  • Query expression translation

However, the individual operations are quite simple - they don't perform indexing etc. Joins and groupings are done using hash tables, but straightforward queries like "where" are just linear. Don't forget that LINQ to Objects usually just treats the data as a forward-only readable sequence - it can't do things like a binary search.
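To give a feel for the hash table approach, here is a rough sketch of an inner join built on ToLookup. It is not the framework's actual implementation (for one thing, the real Join skips null keys, which this sketch does not handle specially); HashJoin and the sample data are hypothetical, and top-level statements (C# 9 or later) are assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: a simplified hash join in the spirit of Enumerable.Join.
IEnumerable<TResult> HashJoin<TOuter, TInner, TKey, TResult>(
    IEnumerable<TOuter> outer,
    IEnumerable<TInner> inner,
    Func<TOuter, TKey> outerKey,
    Func<TInner, TKey> innerKey,
    Func<TOuter, TInner, TResult> resultSelector)
{
    // Build phase: one pass over inner, grouping elements by key in a
    // hash-based lookup.
    ILookup<TKey, TInner> lookup = inner.ToLookup(innerKey);

    // Probe phase: one pass over outer. Each key is found by hashing
    // rather than by rescanning inner; a missing key yields an empty sequence.
    foreach (TOuter o in outer)
    {
        foreach (TInner i in lookup[outerKey(o)])
        {
            yield return resultSelector(o, i);
        }
    }
}

// Made-up sample data.
var people = new[] { (Id: 1, Name: "Fred"), (Id: 2, Name: "Wilma") };
var orders = new[]
{
    (PersonId: 1, Item: "Rock"),
    (PersonId: 1, Item: "Club"),
    (PersonId: 3, Item: "Car"),
};

List<string> sketch = HashJoin(people, orders,
    p => p.Id, o => o.PersonId,
    (p, o) => p.Name + ":" + o.Item).ToList();

List<string> builtIn = people.Join(orders,
    p => p.Id, o => o.PersonId,
    (p, o) => p.Name + ":" + o.Item).ToList();

Console.WriteLine(string.Join(", ", sketch));    // Fred:Rock, Fred:Club
Console.WriteLine(sketch.SequenceEqual(builtIn)); // True
```

The cost is roughly one pass over each input plus the hashing, instead of comparing every outer element with every inner element.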

Normally I'd expect hand-written queries to be marginally faster than LINQ to Objects as there are fewer layers of abstraction, but they'll be less readable and the performance difference usually won't be significant.

As ever for performance questions: when in doubt, measure!
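A very crude way to start measuring is Stopwatch around both versions. The workload below is invented for illustration, and a single timed run like this is affected by JIT compilation and other noise, so treat the numbers as a rough indication at best; a dedicated tool such as BenchmarkDotNet is a better fit for anything you want to rely on. Top-level statements (C# 9 or later) are assumed.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

// Hypothetical workload: sum the multiples of 3 in a large array.
int[] data = Enumerable.Range(0, 1_000_000).ToArray();

// LINQ to Objects version.
Stopwatch stopwatch = Stopwatch.StartNew();
long linqSum = data.Where(n => n % 3 == 0)
                   .Select(n => (long)n)
                   .Sum();
stopwatch.Stop();
TimeSpan linqTime = stopwatch.Elapsed;

// Hand-written version of the same computation.
stopwatch.Restart();
long loopSum = 0;
foreach (int n in data)
{
    if (n % 3 == 0)
    {
        loopSum += n;
    }
}
stopwatch.Stop();
TimeSpan loopTime = stopwatch.Elapsed;

// Always check that both versions compute the same thing before
// comparing how long they took.
Console.WriteLine(linqSum == loopSum);
Console.WriteLine($"LINQ: {linqTime.TotalMilliseconds} ms");
Console.WriteLine($"Loop: {loopTime.TotalMilliseconds} ms");
```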

Jon Skeet answered Feb 01 '23 09:02