
LINQ styling: chaining Where clauses vs. the && operator

Is there a (logical/performance) difference to writing:

ATable.Where(x=> condition1 && condition2 && condition3)

or

ATable.Where(x=>condition1).Where(x=>condition2).Where(x=>condition3)

I've been using the former, but realised that with the latter I can more easily read a query and copy parts of it out to use somewhere else. Any thoughts?

Joe asked Mar 04 '11 01:03



1 Answer

Short answer
You should do whatever you feel is more readable and maintainable in your application, as both will evaluate to the same collection.

Long answer (quite long)

Linq To Objects
ATable.Where(x=> condition1 && condition2 && condition3)
For this example, since there is only one predicate, the compiler only needs to generate one delegate and one compiler-generated method.
From Reflector:

if (CS$<>9__CachedAnonymousMethodDelegate4 == null)
{
    CS$<>9__CachedAnonymousMethodDelegate4 = new Func<ATable, bool>(null, (IntPtr) <Main>b__0);
}
Enumerable.Where<ATable>(tables, CS$<>9__CachedAnonymousMethodDelegate4).ToList<ATable>();

The compiler generated method:

[CompilerGenerated]
private static bool <Main>b__0(ATable m)
{
    return ((m.Prop1 && m.Prop2) && m.Prop3);
}

As you can see, there is only one call into Enumerable.Where<T> with the delegate, as expected, since there was only one Where extension method call.


ATable.Where(x=>condition1).Where(x=>condition2).Where(x=>condition3)
Now, for this example, a lot more code is generated.

    if (CS$<>9__CachedAnonymousMethodDelegate5 == null)
    {
        CS$<>9__CachedAnonymousMethodDelegate5 = new Func<ATable, bool>(null, (IntPtr) <Main>b__1);
    }
    if (CS$<>9__CachedAnonymousMethodDelegate6 == null)
    {
        CS$<>9__CachedAnonymousMethodDelegate6 = new Func<ATable, bool>(null, (IntPtr) <Main>b__2);
    }
    if (CS$<>9__CachedAnonymousMethodDelegate7 == null)
    {
        CS$<>9__CachedAnonymousMethodDelegate7 = new Func<ATable, bool>(null, (IntPtr) <Main>b__3);
    }
    Enumerable.Where<ATable>(Enumerable.Where<ATable>(Enumerable.Where<ATable>(tables, CS$<>9__CachedAnonymousMethodDelegate5), CS$<>9__CachedAnonymousMethodDelegate6), CS$<>9__CachedAnonymousMethodDelegate7).ToList<ATable>();

Since we have three chained extension method calls, we also get three Func<ATable, bool> delegates and three compiler-generated methods.

[CompilerGenerated]
private static bool <Main>b__1(ATable m)
{
    return m.Prop1;
}

[CompilerGenerated]
private static bool <Main>b__2(ATable m)
{
    return m.Prop2;
}

[CompilerGenerated]
private static bool <Main>b__3(ATable m)
{
    return m.Prop3;
}

Now, this looks like it should be slower, since there is a lot more code. However, since all execution is deferred until the sequence is actually enumerated (GetEnumerator() is called and the result is iterated), I doubt any noticeable difference will present itself.
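To make the deferred execution point concrete, here is a minimal sketch. The numbers array and the three conditions are made up for illustration and stand in for ATable and condition1..3; the counter only tracks calls to the first predicate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data standing in for ATable.
int[] numbers = Enumerable.Range(1, 20).ToArray();
int evaluations = 0;

// Chained form. Nothing runs yet: Where only builds a lazy pipeline.
IEnumerable<int> chained = numbers
    .Where(n => { evaluations++; return n % 2 == 0; })
    .Where(n => n > 5)
    .Where(n => n < 15);

int evaluationsBeforeEnumeration = evaluations; // still 0 at this point

// Single-predicate form with the same three conditions.
IEnumerable<int> combined = numbers.Where(n => n % 2 == 0 && n > 5 && n < 15);

// Enumeration (here via ToList) is what actually executes the predicates.
List<int> chainedResult = chained.ToList();
List<int> combinedResult = combined.ToList();

Console.WriteLine(evaluationsBeforeEnumeration);                // 0
Console.WriteLine(evaluations);                                 // 20: one pass over the source
Console.WriteLine(chainedResult.SequenceEqual(combinedResult)); // True
```

Both forms walk the source once and yield the same elements, so the choice can reasonably be made on readability.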

Some gotchas that could affect performance

  • Any call that enumerates the sequence in the middle of the chain will cause the collection to be iterated. ATable.Where().ToList().Where().ToList() will iterate the collection with the first predicate when the first ToList is called, and then iterate the resulting list again with the second ToList. Try to defer enumeration until the very last moment to reduce the number of times the collection is iterated.
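The gotcha above can be sketched as follows. The data and predicates are hypothetical; the counter only shows that a ToList() in the middle of the chain forces the first Where to run immediately and builds an intermediate list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data.
List<int> numbers = Enumerable.Range(1, 10).ToList();
int firstPredicateCalls = 0;

// ToList() mid-chain: the first Where runs right here and its output
// is copied into an intermediate list.
List<int> intermediate = numbers
    .Where(n => { firstPredicateCalls++; return n % 2 == 0; })
    .ToList();
int callsAfterFirstToList = firstPredicateCalls; // already 10

// The second Where then makes a separate pass over the intermediate list.
List<int> eagerResult = intermediate.Where(n => n > 4).ToList();

// Deferred version: one lazy pipeline, enumerated once at the end.
List<int> deferredResult = numbers
    .Where(n => n % 2 == 0)
    .Where(n => n > 4)
    .ToList();

Console.WriteLine(callsAfterFirstToList);                     // 10
Console.WriteLine(intermediate.Count);                        // 5
Console.WriteLine(eagerResult.SequenceEqual(deferredResult)); // True
```

The results match, but the first version allocates an extra list and walks the data in two separate passes.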

Linq To Entities
Since we are now using IQueryable<T>, the compiler-generated code is a bit different, as we are using Expression<Func<T, bool>> instead of our normal Func<T, bool>.

All-in-one example:
var allInOneWhere = entityFrameworkEntities.MovieSets.Where(m => m.Name == "The Matrix" && m.Id == 10 && m.GenreType_Value == 3);

This generates one heck of a statement.

IQueryable<MovieSet> allInOneWhere = Queryable.Where<MovieSet>(entityFrameworkEntities.MovieSets, Expression.Lambda<Func<MovieSet, bool>>(Expression.AndAlso(Expression.AndAlso(Expression.Equal(Expression.Property(CS$0$0000 = Expression.Parameter(typeof(MovieSet), "m"), (MethodInfo) methodof(MovieSet.get_Name)), ..tons more stuff...ParameterExpression[] { CS$0$0000 }));

The most notable thing is that we end up with one expression tree that is built from Expression.AndAlso pieces. And, as expected, we have only one call to Queryable.Where.

var chainedWhere = entityFrameworkEntities.MovieSets.Where(m => m.Name == "The Matrix").Where(m => m.Id == 10).Where(m => m.GenreType_Value == 3);

I won't even bother pasting in the compiler-generated code for this; it is way too long. But in short, we end up with three nested calls, Queryable.Where(Queryable.Where(Queryable.Where())), and three expressions. This again is expected, as we have three chained Where clauses.
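The shape of the two expression trees can be inspected without reading decompiled code. This is a sketch only: it assumes an in-memory array wrapped with AsQueryable() as a stand-in for entityFrameworkEntities.MovieSets, the anonymous type and its Genre property are made up, and CountWhereCalls is a hypothetical helper.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical in-memory stand-in for the MovieSets entity set.
var movies = new[]
{
    new { Id = 10, Name = "The Matrix", Genre = 3 },
    new { Id = 11, Name = "Speed", Genre = 3 },
}.AsQueryable();

// Hypothetical helper: walks down the query expression, counting nested Where calls.
int CountWhereCalls(Expression expression)
{
    int count = 0;
    while (expression is MethodCallExpression call && call.Method.Name == "Where")
    {
        count++;
        expression = call.Arguments[0]; // the source this Where was applied to
    }
    return count;
}

var allInOne = movies.Where(m => m.Name == "The Matrix" && m.Id == 10 && m.Genre == 3);
var chained = movies
    .Where(m => m.Name == "The Matrix")
    .Where(m => m.Id == 10)
    .Where(m => m.Genre == 3);

Console.WriteLine(CountWhereCalls(allInOne.Expression)); // 1
Console.WriteLine(CountWhereCalls(chained.Expression));  // 3
Console.WriteLine(allInOne.Count() == chained.Count());  // True
```

A real query provider is free to flatten the three nested Where calls when it translates the tree, which is why the structural difference here need not show up in the final query.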

Generated Sql
Like IEnumerable<T>, IQueryable<T> does not execute until it is enumerated. Because of this, we can be happy to know that both produce the exact same SQL statement:

SELECT 
[Extent1].[AtStore_Id] AS [AtStore_Id], 
[Extent1].[GenreType_Value] AS [GenreType_Value], 
[Extent1].[Id] AS [Id], 
[Extent1].[Name] AS [Name]
FROM [dbo].[MovieSet] AS [Extent1]
WHERE (N'The Matrix' = [Extent1].[Name]) AND (10 = [Extent1].[Id]) AND (3 = [Extent1].[GenreType_Value])

Some gotchas that could affect performance

  • Any call that enumerates the query in the middle of the chain will cause a call out to SQL, e.g. ATable.Where().ToList().Where() will actually query SQL for all records matching the first predicate and then filter the resulting list with LINQ to Objects using the second predicate.
  • Since you mention extracting the predicates to use elsewhere, make sure they are in the form of Expression<Func<T, bool>> and not simply Func<T, bool>. The first can be parsed as an expression tree and converted into valid SQL; the second will cause ALL OBJECTS to be returned, and the Func<T, bool> will then execute on that collection in memory.
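The difference between the two predicate types shows up in which Where overload the compiler picks. A minimal sketch, again assuming an in-memory AsQueryable() source as a stand-in for an Entity Framework set (the variable names are made up):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical in-memory stand-in for an IQueryable<T> entity set.
IQueryable<int> source = new[] { 1, 2, 3, 4 }.AsQueryable();

// The same lambda, stored once as an expression tree and once as a delegate.
Expression<Func<int, bool>> asExpression = n => n > 2;
Func<int, bool> asDelegate = n => n > 2;

// Binds to Queryable.Where: the result is still an IQueryable<int>,
// so a query provider could translate the predicate.
var viaExpression = source.Where(asExpression);

// Binds to Enumerable.Where: the result is a plain IEnumerable<int>,
// so the predicate can only run in memory over whatever the source returns.
var viaDelegate = source.Where(asDelegate);

bool expressionStaysQueryable = viaExpression is IQueryable<int>;
bool delegateStaysQueryable = viaDelegate is IQueryable<int>;

Console.WriteLine(expressionStaysQueryable); // True
Console.WriteLine(delegateStaysQueryable);   // False
```

With a real database-backed provider, the second form is the one that pulls the unfiltered rows into memory before filtering.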

I hope this was a bit helpful to answer your question.

Mark Coleman answered Oct 06 '22 17:10
