Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

LINQ: One "where" clause versus multiple chained "where clauses"

Tags:

c#

.net

linq

I was wondering about the next thing:

I can query my database using LINQ To Entities, like this:

GetAll().Where(
      x => x.SomeProperty == 'Yes'
      && x.SomeOtherProperty == 'No')
.ToList();

While I see some of my colleagues change these two WHERE-clauses, like this:

GetAll().Where(x => x.SomeProperty == 'Yes')
      .Where(x.SomeOtherProperty == 'No')
.ToList();

Both queries should produce the same output, but I was wondering if one of the two has any advantages/disadvantages. For instance, does one method produce a slower database query, or would they generate the same sql-query?

like image 290
Steven Lemmens Avatar asked Apr 27 '15 15:04

Steven Lemmens


2 Answers

I set up a test project using the model and context described in this article and logged the SQL for two different queries, following the pattern in your question. The queries I made were:

db.Blogs
    .Where(b => b.BlogId == 0)
    .Where(b => b.Name == "Foo");

and

db.Blogs
    .Where(b => b.BlogId == 0 && b.Name == "Foo");

The generated SQL for both queries is identical:

SELECT 
    [Extent1].[BlogId] AS [BlogId], 
    [Extent1].[Name] AS [Name]
    FROM [dbo].[Blogs] AS [Extent1]
    WHERE (0 = [Extent1].[BlogId]) AND (N'Foo' = [Extent1].[Name])

So it seems (at least for simple cases like this), there is no significant performance difference one way or the other. I guess you could argue it takes the expression visitor a bit longer to examine your tree if you use the multiple Where approach, but this is negligible in the long run.

like image 62
Asad Saeeduddin Avatar answered Sep 26 '22 01:09

Asad Saeeduddin


I created the following test case in LINQPad.

void Main()
{
    var listOfObjects = Enumerable.Range(0, 100000).Select(x => new TestClass() { SomeProperty = x.ToString() }).ToArray();
    var iterations = DateTime.UtcNow.Day * 50;
    Console.WriteLine("Doing {0} iterations", iterations);

    var sw = Stopwatch.StartNew();
    for(int i = 0; i < iterations; i++)
    {
        var filteredList = listOfObjects.Where(x => x.SomeProperty.Contains('0') && x.SomeProperty.Contains('1')).ToArray();
    }
    sw.Stop();

    Console.WriteLine(sw.ElapsedMilliseconds);

    sw.Reset();
    sw.Start();
    for(int i = 0; i < iterations; i++)
    {
        var filteredList = listOfObjects.Where(x => x.SomeProperty.Contains('0')).Where(x => x.SomeProperty.Contains('1')).ToArray();
    }
    sw.Stop();
    Console.WriteLine(sw.ElapsedMilliseconds);
}

public class TestClass
{
    public string SomeProperty { get; set; }
    public string SomeOtherProperty { get; set; }
}

It yields the following results for 1350 iterations.

  • For the && clause, 19415 ms
  • For the two Where clauses, 19596 ms

This seems to indicate that the first one is a little bit faster. This could just be the compiler doing some magic behind the scenes but by calculating the list of TestClass as well as the number of iterations at run time, it shouldn't be able to do any.

To be sure with actual SQL, you should take a look at the IL, but based on mine, it looks like the effective code is identical.

<Main>b__2:
IL_0000:  ldarg.0     
IL_0001:  callvirt    UserQuery+TestClass.get_SomeProperty
IL_0006:  ldc.i4.s    30 
IL_0008:  call        System.Linq.Enumerable.Contains
IL_000D:  brfalse.s   IL_001E
IL_000F:  ldarg.0     
IL_0010:  callvirt    UserQuery+TestClass.get_SomeProperty
IL_0015:  ldc.i4.s    31 
IL_0017:  call        System.Linq.Enumerable.Contains
IL_001C:  br.s        IL_001F
IL_001E:  ldc.i4.0    
IL_001F:  nop         
IL_0020:  stloc.0     // CS$1$0000
IL_0021:  br.s        IL_0023
IL_0023:  ldloc.0     // CS$1$0000
IL_0024:  ret         

<Main>b__3:
IL_0000:  ldarg.0     
IL_0001:  callvirt    UserQuery+TestClass.get_SomeProperty
IL_0006:  ldc.i4.s    30 
IL_0008:  call        System.Linq.Enumerable.Contains
IL_000D:  stloc.0     // CS$1$0000
IL_000E:  br.s        IL_0010
IL_0010:  ldloc.0     // CS$1$0000
IL_0011:  ret         

<Main>b__4:
IL_0000:  ldarg.0     
IL_0001:  callvirt    UserQuery+TestClass.get_SomeProperty
IL_0006:  ldc.i4.s    31 
IL_0008:  call        System.Linq.Enumerable.Contains
IL_000D:  stloc.0     // CS$1$0000
IL_000E:  br.s        IL_0010
IL_0010:  ldloc.0     // CS$1$0000
IL_0011:  ret         

The Where clause that uses && is the first bit of IL, showing two calls to Contains in the anonymous method. The test that uses the double Where clause calls two different anonymous methods, however they appear to be identical, except for the parameter passed to Contains. Looking at the main method, we can see the calls made, and see that the overhead is minimal. It results in slightly slower code when using two Where clauses.

My suggestion is to run a test (like I coded above) against the actual SQL database and see what the performance looks like. The generated IL may be almost identical, but could be significantly different when translated to SQL.

like image 42
Pete Garafano Avatar answered Sep 26 '22 01:09

Pete Garafano