Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is Linq to Objects chaining where clause VS && performance hit is that insignificant?

following this question:
Should I use two “where” clauses or “&&” in my LINQ query?
Can or should I join two Where clauses together in a LINQ Query?
linq styling, chaining where clause vs and operator
Jon Skeet: blog post

Most answers said that the Linq To Objects performance hit in chaining where clause vs && in a single lambda expression is negligible so its up to your coding style to decide which one to use.

I started by looking at IL assembly, you can definitely see that chaining where clause will result in Where extension being called 2 times the and the input of the second call is the result of the first.

var numbers = new List<int>() { 1, 2 ,3,4,5,6,7,8,9,10};
IEnumerable<int> query = numbers.Where(x=> x>2).Where(x => x<5);

//The IL

IL_005B:  ldloc.0     // numbers
IL_005C:  ldsfld      UserQuery.CS$<>9__CachedAnonymousMethodDelegate3
IL_0061:  brtrue.s    IL_0076
IL_0063:  ldnull      
IL_0064:  ldftn       b__1
IL_006A:  newobj      System.Func<System.Int32,System.Boolean>..ctor
IL_006F:  stsfld      UserQuery.CS$<>9__CachedAnonymousMethodDelegate3
IL_0074:  br.s        IL_0076
IL_0076:  ldsfld      UserQuery.CS$<>9__CachedAnonymousMethodDelegate3
IL_007B:  call        System.Linq.Enumerable.Where <-----------First Call
IL_0080:  ldsfld      UserQuery.CS$<>9__CachedAnonymousMethodDelegate4
IL_0085:  brtrue.s    IL_009A
IL_0087:  ldnull      
IL_0088:  ldftn       b__2
IL_008E:  newobj      System.Func<System.Int32,System.Boolean>..ctor
IL_0093:  stsfld      UserQuery.CS$<>9__CachedAnonymousMethodDelegate4
IL_0098:  br.s        IL_009A
IL_009A:  ldsfld      UserQuery.CS$<>9__CachedAnonymousMethodDelegate4
IL_009F:  call        System.Linq.Enumerable.Where <------------Second Call
IL_00A4:  stloc.1     // query
b__1:
IL_0000:  ldarg.0     
IL_0001:  ldc.i4.2    
IL_0002:  cgt         
IL_0004:  stloc.0     // CS$1$0000
IL_0005:  br.s        IL_0007
IL_0007:  ldloc.0     // CS$1$0000
IL_0008:  ret       
 b__2:
 IL_0000:  ldarg.0     
 IL_0001:  ldc.i4.5    
 IL_0002:  clt         
 IL_0004:  stloc.0     // CS$1$0000
 IL_0005:  br.s        IL_0007
 IL_0007:  ldloc.0     // CS$1$0000
 IL_0008:  ret    

Then I run a simple bench mark on Win7 .Net 3.5 and 4.0

  static void Main(string[] args)
{               
    int size = 10000000;
    Console.WriteLine("chain clauses");
    RunTests(size,true);

    Console.WriteLine("use and");
    RunTests(size,false);               
}


static void RunTests(int size, bool chainClauses)
{
    for (int i = 1; i <= 10; i++)
     {
        if (chainClauses)
            RunTestChaining(i, size);
        else
            RunTestAnd(i, size);
        }
    }

    static void RunTestChaining(int depth, int size)
    {
        IEnumerable<string> input = Enumerable.Repeat("value", size);                      

        switch (depth)
        {
            case 1:
                input = input.Where(x => !x.Equals("1"));
                break;
            case 2:
                input = input.Where(x => !x.Equals("1")).Where(x => !x.Equals("2"));
                break;
            case 3:
                input = input.Where(x => !x.Equals("1")).Where(x => !x.Equals("2")).Where(x => !x.Equals("3"));
                break;
            case 4:
                input = input.Where(x => !x.Equals("1")).Where(x => !x.Equals("2")).Where(x => !x.Equals("3")).Where(x => !x.Equals("4"));
                break;
            case 5:
                input = input.Where(x => !x.Equals("1")).Where(x => !x.Equals("2")).Where(x => !x.Equals("3")).Where(x => !x.Equals("4")).Where(x => !x.Equals("5"));
                break;
            case 6:
                input = input.Where(x => !x.Equals("1")).Where(x => !x.Equals("2")).Where(x => !x.Equals("3")).Where(x => !x.Equals("4")).Where(x => !x.Equals("5")).Where(x => !x.Equals("6"));
                break;
            case 7:
                input = input.Where(x => !x.Equals("1")).Where(x => !x.Equals("2")).Where(x => !x.Equals("3")).Where(x => !x.Equals("4")).Where(x => !x.Equals("5")).Where(x => !x.Equals("6")).Where(x => !x.Equals("7"));
                break;
            case 8:
                input = input.Where(x => !x.Equals("1")).Where(x => !x.Equals("2")).Where(x => !x.Equals("3")).Where(x => !x.Equals("4")).Where(x => !x.Equals("5")).Where(x => !x.Equals("6")).Where(x => !x.Equals("7")).Where(x => !x.Equals("8"));
                break;
            case 9:
                input = input.Where(x => !x.Equals("1")).Where(x => !x.Equals("2")).Where(x => !x.Equals("3")).Where(x => !x.Equals("4")).Where(x => !x.Equals("5")).Where(x => !x.Equals("6")).Where(x => !x.Equals("7")).Where(x => !x.Equals("8")).Where(x => !x.Equals("9"));
                break;
            case 10:
                input = input.Where(x => !x.Equals("1")).Where(x => !x.Equals("2")).Where(x => !x.Equals("3")).Where(x => !x.Equals("4")).Where(x => !x.Equals("5")).Where(x => !x.Equals("6")).Where(x => !x.Equals("7")).Where(x => !x.Equals("8")).Where(x => !x.Equals("9")).Where(x => !x.Equals("10"));
                break;
        }

        Stopwatch sw = Stopwatch.StartNew();
        var count = input.Count();
        sw.Stop();
        Console.WriteLine("Depth: {0} Count: {1} Time: {2}ms",
                          depth, count, sw.ElapsedMilliseconds);
    }

static void RunTestAnd(int depth, int size )
    {
        IEnumerable<string> input = Enumerable.Repeat("value", size);
        Func<string, bool> predicate = x => true;
        switch (depth)
        {
            case 1:
                predicate = x => !x.Equals("1");
                break;
            case 2:
                predicate = x => !x.Equals("1") && !x.Equals("2");
                break;
            case 3:
                predicate = x => !x.Equals("1") && !x.Equals("2") && !x.Equals("3");
                break;
            case 4:
                predicate = x => !x.Equals("1") && !x.Equals("2") && !x.Equals("3")&&!x.Equals("3");
                break;
            case 5:
                 predicate = x => !x.Equals("1") && !x.Equals("2") && !x.Equals("3")&&!x.Equals("3")&& !x.Equals("5");
                break;
            case 6:
                predicate = x => !x.Equals("1") && !x.Equals("2") && !x.Equals("3")&&!x.Equals("3")&& !x.Equals("5") && !x.Equals("6");
                break;
            case 7:
                predicate = x => !x.Equals("1") && !x.Equals("2") && !x.Equals("3")&&!x.Equals("3")&& !x.Equals("5") && !x.Equals("6") && !x.Equals("7");
                break;
            case 8:
                predicate = x => !x.Equals("1") && !x.Equals("2") && !x.Equals("3")&&!x.Equals("3")&& !x.Equals("5") && !x.Equals("6") && !x.Equals("7") && !x.Equals("8");
                break;
            case 9:
                predicate = x => !x.Equals("1") && !x.Equals("2") && !x.Equals("3")&&!x.Equals("3")&& !x.Equals("5") && !x.Equals("6") && !x.Equals("7") && !x.Equals("8") && !x.Equals("9");
                break;
            case 10:
                predicate = x => !x.Equals("1") && !x.Equals("2") && !x.Equals("3")&&!x.Equals("3")&& !x.Equals("5") && !x.Equals("6") && !x.Equals("7") && !x.Equals("8") && !x.Equals("9") && !x.Equals("10");
                break;
        }
        input = input.Where(predicate);

        Stopwatch sw = Stopwatch.StartNew();
        var count = input.Count();
        sw.Stop();
        Console.WriteLine("Depth: {0} Count: {1} Time: {2}ms",
                          depth, count, sw.ElapsedMilliseconds);
    } 

And the results:

// .Net 3.5                                //.Net 4.0
chain clauses                               chain clauses
Depth: 1 Count: 10000000 Time: 181ms        Depth: 1 Count: 10000000 Time: 216ms
Depth: 2 Count: 10000000 Time: 248ms        Depth: 2 Count: 10000000 Time: 278ms
Depth: 3 Count: 10000000 Time: 315ms        Depth: 3 Count: 10000000 Time: 347ms
Depth: 4 Count: 10000000 Time: 378ms        Depth: 4 Count: 10000000 Time: 437ms
Depth: 5 Count: 10000000 Time: 443ms        Depth: 5 Count: 10000000 Time: 509ms
Depth: 6 Count: 10000000 Time: 514ms        Depth: 6 Count: 10000000 Time: 573ms
Depth: 7 Count: 10000000 Time: 579ms        Depth: 7 Count: 10000000 Time: 649ms
Depth: 8 Count: 10000000 Time: 644ms        Depth: 8 Count: 10000000 Time: 727ms
Depth: 9 Count: 10000000 Time: 978ms        Depth: 9 Count: 10000000 Time: 1278ms
Depth: 10 Count: 10000000 Time: 1546ms      Depth: 10 Count: 10000000 Time: 1075ms
use and                                     use and
Depth: 1 Count: 10000000 Time: 181ms        Depth: 1 Count: 10000000 Time: 202ms
Depth: 2 Count: 10000000 Time: 200ms        Depth: 2 Count: 10000000 Time: 234ms
Depth: 3 Count: 10000000 Time: 228ms        Depth: 3 Count: 10000000 Time: 267ms
Depth: 4 Count: 10000000 Time: 245ms        Depth: 4 Count: 10000000 Time: 303ms
Depth: 5 Count: 10000000 Time: 267ms        Depth: 5 Count: 10000000 Time: 335ms
Depth: 6 Count: 10000000 Time: 289ms        Depth: 6 Count: 10000000 Time: 364ms
Depth: 7 Count: 10000000 Time: 312ms        Depth: 7 Count: 10000000 Time: 397ms
Depth: 8 Count: 10000000 Time: 326ms        Depth: 8 Count: 10000000 Time: 432ms
Depth: 9 Count: 10000000 Time: 366ms        Depth: 9 Count: 10000000 Time: 462ms
Depth: 10 Count: 10000000 Time: 375ms       Depth: 10 Count: 10000000 Time: 493ms

According to these results there is a significant performance deterioration, which points to a conclusion that you should avoid chaining where clause in Linq to Objects.
Or There Is something I am missing?

like image 453
makc Avatar asked Jun 27 '13 07:06

makc


People also ask

Can we use multiple where clause in LINQ?

Filter collections using Where clause in C#. A single query expression may have multiple where clauses.

What is LINQ to objects in C#?

The term "LINQ to Objects" refers to the use of LINQ queries with any IEnumerable or IEnumerable<T> collection directly, without the use of an intermediate LINQ provider or API such as LINQ to SQL or LINQ to XML. You can use LINQ to query any enumerable collections such as List<T>, Array, or Dictionary<TKey,TValue>.

What are the uses of LINQ to objects?

In a nutshell, LINQ to Objects provides the developer with the means to conduct queries against an in-memory collection of objects. The techniques used to query against such collections of objects are similar to but simpler than the approaches used to conduct queries against a relational database using SQL statements.


1 Answers

Yes, it would be good advice to try to avoid chaining .Where() clauses on top of one another when it's a no-brainer and you have the opportunity, like in this microbenchmark.

Note that .NET's LINQ-to-Objects implementation is smart enough to combine the predicates for you when you do this. It's not as bad as it could have been, but running the fancy chained delegate is not going to be as fast or as elegant as running a single delegate with all the logic inside of it.

However, suppose you have an arbitrary instance of an IEnumerable<T>, which may or may not be the result of some .Where() method already, and you're writing a method that needs to filter it on some predicate.

Based on the results of this microbenchmark, are you really going to refactor your entire code base so that you can "avoid chaining where clause in Linq to objects", or are you just going to add one more .Where() and move on with your life?

As always, if you run actual performance testing on an application that has performance problems (meaning, performance is outside of the range defined as "acceptable"), and the results indicate that chaining .Where() clauses is a bottleneck, then you might be more justified to try and rethink what's going on.

Also, out of curiosity, I changed the duplicated "3" clauses to "4" clauses in RunTestAnd, ran your code on my 4-core Windows 8.1 x64 machine in .NET 4.5.1 (release mode, no debugger), then ran it after changing the line that initializes input to:

var input = Enumerable.Repeat("value", size).AsParallel();

with these results (output massaged for prettiness, but I vouch that these were the actual numbers):

chain clauses (parallel)                  chain clauses (serial)
Depth:  1 Count: 10000000 Time:  284ms    Depth:  1 Count: 10000000 Time:  185ms
Depth:  2 Count: 10000000 Time:  241ms    Depth:  2 Count: 10000000 Time:  248ms
Depth:  3 Count: 10000000 Time:  267ms    Depth:  3 Count: 10000000 Time:  308ms
Depth:  4 Count: 10000000 Time:  256ms    Depth:  4 Count: 10000000 Time:  370ms
Depth:  5 Count: 10000000 Time:  371ms    Depth:  5 Count: 10000000 Time:  432ms
Depth:  6 Count: 10000000 Time:  345ms    Depth:  6 Count: 10000000 Time:  667ms
Depth:  7 Count: 10000000 Time:  342ms    Depth:  7 Count: 10000000 Time:  569ms
Depth:  8 Count: 10000000 Time:  465ms    Depth:  8 Count: 10000000 Time:  627ms
Depth:  9 Count: 10000000 Time:  434ms    Depth:  9 Count: 10000000 Time:  862ms
Depth: 10 Count: 10000000 Time:  416ms    Depth: 10 Count: 10000000 Time: 1235ms
use and (parallel)                        use and (serial)
Depth:  1 Count: 10000000 Time:  263ms    Depth:  1 Count: 10000000 Time:  182ms
Depth:  2 Count: 10000000 Time:  265ms    Depth:  2 Count: 10000000 Time:  217ms
Depth:  3 Count: 10000000 Time:  239ms    Depth:  3 Count: 10000000 Time:  228ms
Depth:  4 Count: 10000000 Time:  255ms    Depth:  4 Count: 10000000 Time:  255ms
Depth:  5 Count: 10000000 Time:  272ms    Depth:  5 Count: 10000000 Time:  275ms
Depth:  6 Count: 10000000 Time:  255ms    Depth:  6 Count: 10000000 Time:  295ms
Depth:  7 Count: 10000000 Time:  268ms    Depth:  7 Count: 10000000 Time:  320ms
Depth:  8 Count: 10000000 Time:  268ms    Depth:  8 Count: 10000000 Time:  339ms
Depth:  9 Count: 10000000 Time:  305ms    Depth:  9 Count: 10000000 Time:  363ms
Depth: 10 Count: 10000000 Time:  267ms    Depth: 10 Count: 10000000 Time:  386ms

What this suggests to me is that if you do find that your 10-deep chain of .Where() clauses is a bottleneck, there's no easy straight-forward refactoring, and your specific usage scenario allows it, consider trying PLINQ.

like image 92
Joe Amenta Avatar answered Oct 29 '22 15:10

Joe Amenta