LINQ Performance

Question

What exactly is happening behind the scenes in a LINQ query against an object collection? Is it just syntactic sugar, or is something else happening that makes it a more efficient query?

Jon Skeet · Accepted Answer

Do you mean in terms of a query expression, or what the query does behind the scenes?

Query expressions are expanded into "normal" C# first. For example:

var query = from x in source
            where x.Name == "Fred"
            select x.Age;

is translated to:

var query = source.Where(x => x.Name == "Fred")
                  .Select(x => x.Age);

The exact meaning of this depends on the type of source, of course. In LINQ to Objects, source typically implements IEnumerable<T>, so the Enumerable extension methods come into play, but it could be a different set of extension methods. (LINQ to SQL would use the Queryable extension methods, for example.)

Now, suppose we are using LINQ to Objects... after extension method expansion, the above code becomes:

var query = Enumerable.Select(Enumerable.Where(source, x => x.Name == "Fred"),
                              x => x.Age);
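To make the equivalence concrete, here is a minimal sketch that runs all three forms against the same data and prints the results. The Person class, the sample data and the QueryForms helper names are assumptions made up for illustration; only the three query shapes come from the text above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type, used only for this illustration.
public sealed class Person
{
    public Person(string name, int age) { Name = name; Age = age; }
    public string Name { get; }
    public int Age { get; }
}

public static class QueryForms
{
    // Made-up sample data.
    public static List<Person> SampleSource() => new List<Person>
    {
        new Person("Fred", 42),
        new Person("Wilma", 39),
        new Person("Fred", 7),
    };

    // 1. Query expression, as written in the source code.
    public static List<int> QueryExpression(IEnumerable<Person> source) =>
        (from x in source
         where x.Name == "Fred"
         select x.Age).ToList();

    // 2. The same query after query expression translation.
    public static List<int> MethodSyntax(IEnumerable<Person> source) =>
        source.Where(x => x.Name == "Fred")
              .Select(x => x.Age)
              .ToList();

    // 3. The same query after extension method expansion.
    public static List<int> StaticCalls(IEnumerable<Person> source) =>
        Enumerable.Select(Enumerable.Where(source, x => x.Name == "Fred"),
                          x => x.Age).ToList();

    public static void Main()
    {
        List<Person> source = SampleSource();
        Console.WriteLine(string.Join(", ", QueryExpression(source))); // 42, 7
        Console.WriteLine(string.Join(", ", MethodSyntax(source)));    // 42, 7
        Console.WriteLine(string.Join(", ", StaticCalls(source)));     // 42, 7
    }
}
```

All three methods should compile to essentially the same calls, which is why they return the same ages in the same order.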

Next the implementations of Select and Where become important. Leaving out error checking, they're something like this:

public static IEnumerable<T> Where<T>(this IEnumerable<T> source,
                                      Func<T, bool> predicate)
{
    foreach (T element in source)
    {
        if (predicate(element))
        {
            yield return element;
        }
    }
}

public static IEnumerable<TResult> Select<TSource, TResult>
    (this IEnumerable<TSource> source,
     Func<TSource, TResult> selector)
{
    foreach (TSource element in source)
    {
        yield return selector(element);
    }
}
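One consequence of writing these operators as iterator blocks is that nothing runs until the result is enumerated, and elements then flow through the pipeline one at a time. The sketch below illustrates that with copies of the simplified operators. They are renamed WhereSketch and SelectSketch (hypothetical names) purely to avoid clashing with the real Enumerable methods, and the logging is an assumption added for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class SimpleLinq
{
    // Same shape as the simplified Where above, with no error checking.
    public static IEnumerable<T> WhereSketch<T>(this IEnumerable<T> source,
                                                Func<T, bool> predicate)
    {
        foreach (T element in source)
        {
            if (predicate(element))
            {
                yield return element;
            }
        }
    }

    // Same shape as the simplified Select above, with no error checking.
    public static IEnumerable<TResult> SelectSketch<TSource, TResult>
        (this IEnumerable<TSource> source,
         Func<TSource, TResult> selector)
    {
        foreach (TSource element in source)
        {
            yield return selector(element);
        }
    }
}

public static class DeferredDemo
{
    // Records the order in which the predicate, selector and consumer run.
    public static List<string> Trace(int[] numbers)
    {
        var log = new List<string>();
        var query = numbers
            .WhereSketch(n => { log.Add("where " + n); return n % 2 == 0; })
            .SelectSketch(n => { log.Add("select " + n); return n * 10; });

        log.Add("query built"); // no predicate or selector has run yet

        foreach (int value in query)
        {
            log.Add("got " + value);
        }
        return log;
    }

    public static void Main()
    {
        // Prints: query built, where 1, where 2, select 2, got 20,
        //         where 3, where 4, select 4, got 40 (one entry per line)
        foreach (string line in Trace(new[] { 1, 2, 3, 4 }))
        {
            Console.WriteLine(line);
        }
    }
}
```

The "query built" entry comes first because building the query only wires the iterators together; the lambdas run later, interleaved element by element, when the foreach loop pulls values.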

Next there's the expansion of iterator blocks into state machines, which I won't go into here but which I have an article about.
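For a rough idea of what that expansion involves, here is a hand-written approximation of a state machine for the simplified Where. It is only a sketch: the WhereIterator name, the state values and the decision to create a fresh object on every GetEnumerator call are assumptions, and the code the compiler really generates differs in its details.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hand-written stand-in for the class the compiler would generate
// from the iterator block in the simplified Where.
public sealed class WhereIterator<T> : IEnumerable<T>, IEnumerator<T>
{
    private readonly IEnumerable<T> source;
    private readonly Func<T, bool> predicate;
    private IEnumerator<T> inner; // enumerator over source, created on first MoveNext
    private int state;            // 0 = not started, 1 = running, -1 = finished
    private T current;

    public WhereIterator(IEnumerable<T> source, Func<T, bool> predicate)
    {
        this.source = source;
        this.predicate = predicate;
    }

    // Simplification: always hand out a fresh state machine.
    public IEnumerator<T> GetEnumerator() => new WhereIterator<T>(source, predicate);
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    public T Current => current;
    object IEnumerator.Current => current;

    // Each call resumes where the previous one stopped, like "yield return".
    public bool MoveNext()
    {
        if (state == -1)
        {
            return false;
        }
        if (state == 0)
        {
            inner = source.GetEnumerator();
            state = 1;
        }
        while (inner.MoveNext())
        {
            T element = inner.Current;
            if (predicate(element))
            {
                current = element;
                return true;
            }
        }
        Dispose(); // source exhausted
        return false;
    }

    public void Reset() => throw new NotSupportedException();

    public void Dispose()
    {
        inner?.Dispose();
        inner = null;
        state = -1;
    }
}

public static class StateMachineDemo
{
    public static void Main()
    {
        var evens = new WhereIterator<int>(new[] { 1, 2, 3, 4, 5, 6 }, n => n % 2 == 0);
        foreach (int n in evens)
        {
            Console.WriteLine(n); // 2, then 4, then 6
        }
    }
}
```

The point is that the local variables and the position inside the loop become fields, so that MoveNext can return a value and later pick up from the same place.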

Finally, there's the conversion of lambda expressions into extra methods + appropriate delegate instance creation (or expression trees, depending on the signatures of the methods called).
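The difference between the two conversions can be seen by assigning the same lambda to a delegate type and to an expression tree type. This is a small sketch; the LambdaForms class and its member names are made up for illustration.

```csharp
using System;
using System.Linq.Expressions;

public static class LambdaForms
{
    // Converted to a delegate instance: executable code that can only be invoked.
    public static readonly Func<string, bool> AsDelegate =
        name => name == "Fred";

    // Converted to an expression tree: an object model describing the code,
    // which a provider such as LINQ to SQL can inspect and translate.
    public static readonly Expression<Func<string, bool>> AsTree =
        name => name == "Fred";

    public static void Main()
    {
        Console.WriteLine(AsDelegate("Fred"));        // True

        // The tree exposes its structure as data.
        Console.WriteLine(AsTree.Body.NodeType);      // Equal
        Console.WriteLine(AsTree.Parameters[0].Name); // name

        // A tree can still be turned into a delegate at execution time.
        Func<string, bool> compiled = AsTree.Compile();
        Console.WriteLine(compiled("Wilma"));         // False
    }
}
```

Which conversion the compiler picks depends on the parameter type of the method being called: Enumerable.Where takes a Func<T, bool>, whereas Queryable.Where takes an Expression<Func<T, bool>>.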

So basically LINQ uses a lot of clever features of C#:

- Lambda expression conversions (into delegate instances and expression trees)
- Extension methods
- Type inference for generic methods
- Iterator blocks
- Often anonymous types (for use in projections)
- Often implicit typing for local variables
- Query expression translation

However, the individual operations are quite simple: they don't build or use indexes, for example. Joins and groupings are done using hash tables, but straightforward operations like "where" are just linear scans. Don't forget that LINQ to Objects usually just treats the data as a forward-only readable sequence, so it can't do things like a binary search, even when the underlying collection happens to be sorted.
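As a rough illustration of the cost of a linear scan, the sketch below counts how much work is needed to find the last element of a sorted list using LINQ compared with List<T>.BinarySearch. The SearchCost and CountingComparer names are hypothetical, and the counting is there only to make the difference visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Comparer that counts how often it is called.
public sealed class CountingComparer : IComparer<int>
{
    public int Calls { get; private set; }

    public int Compare(int x, int y)
    {
        Calls++;
        return x.CompareTo(y);
    }
}

public static class SearchCost
{
    // Number of predicate calls LINQ makes before it finds the target.
    public static int LinqPredicateCalls(List<int> sorted, int target)
    {
        int calls = 0;
        sorted.Any(n => { calls++; return n == target; });
        return calls;
    }

    // Number of comparisons a binary search makes on the same list.
    public static int BinarySearchComparisons(List<int> sorted, int target)
    {
        var comparer = new CountingComparer();
        sorted.BinarySearch(target, comparer);
        return comparer.Calls;
    }

    public static void Main()
    {
        List<int> sorted = Enumerable.Range(0, 1000).ToList();

        // LINQ has no idea the list is sorted, so it tests every element.
        Console.WriteLine(LinqPredicateCalls(sorted, 999)); // 1000

        // Binary search needs only a handful of comparisons.
        Console.WriteLine(BinarySearchComparisons(sorted, 999));
    }
}
```

If lookups by key dominate, it is usually better to pick a suitable data structure (a sorted list with binary search, or a dictionary) than to expect a LINQ query to find a shortcut.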

Normally I'd expect hand-written code (explicit loops and conditions) to be marginally faster than LINQ to Objects, as there are fewer layers of abstraction, but it will be less readable, and the performance difference usually won't be significant.

As ever for performance questions: when in doubt, measure!
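A very simple way to measure is to time both versions with Stopwatch, as in the sketch below. The workload, data size and iteration count are arbitrary assumptions, and a crude timing loop like this only gives a rough indication; a dedicated benchmarking tool would give more trustworthy numbers.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class MeasureDemo
{
    // LINQ to Objects version of an arbitrary sample workload.
    public static int SumWithLinq(List<int> source) =>
        source.Where(n => n % 3 == 0)
              .Select(n => n * 2)
              .Sum();

    // Hand-written equivalent of the same workload.
    public static int SumWithLoop(List<int> source)
    {
        int total = 0;
        foreach (int n in source)
        {
            if (n % 3 == 0)
            {
                total += n * 2;
            }
        }
        return total;
    }

    // Runs the action repeatedly and returns the total elapsed time.
    public static TimeSpan Time(Func<int> action, int iterations)
    {
        action(); // warm-up call so that JIT compilation is not timed
        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    public static void Main()
    {
        List<int> source = Enumerable.Range(0, 10000).ToList();

        // Timings vary by machine and runtime, so none are shown here.
        Console.WriteLine("LINQ: " + Time(() => SumWithLinq(source), 1000));
        Console.WriteLine("Loop: " + Time(() => SumWithLoop(source), 1000));
    }
}
```

Run it in a release build, without a debugger attached, and check that both versions return the same result before comparing the timings.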

Tags:

linq

Crios
