Following these questions:
Should I use two “where” clauses or “&&” in my LINQ query?
Can or should I join two Where clauses together in a LINQ Query?
linq styling, chaining where clause vs and operator
Jon Skeet: blog post
Most answers said that, in LINQ to Objects, the performance hit of chaining Where clauses versus using && in a single lambda expression is negligible, so it is up to your coding style to decide which one to use.
I started by looking at the IL. You can clearly see that chaining Where clauses results in the Where extension method being called twice, and that the input of the second call is the result of the first.
var numbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
IEnumerable<int> query = numbers.Where(x => x > 2).Where(x => x < 5);
//The IL
IL_005B: ldloc.0 // numbers
IL_005C: ldsfld UserQuery.CS$<>9__CachedAnonymousMethodDelegate3
IL_0061: brtrue.s IL_0076
IL_0063: ldnull
IL_0064: ldftn b__1
IL_006A: newobj System.Func<System.Int32,System.Boolean>..ctor
IL_006F: stsfld UserQuery.CS$<>9__CachedAnonymousMethodDelegate3
IL_0074: br.s IL_0076
IL_0076: ldsfld UserQuery.CS$<>9__CachedAnonymousMethodDelegate3
IL_007B: call System.Linq.Enumerable.Where <-----------First Call
IL_0080: ldsfld UserQuery.CS$<>9__CachedAnonymousMethodDelegate4
IL_0085: brtrue.s IL_009A
IL_0087: ldnull
IL_0088: ldftn b__2
IL_008E: newobj System.Func<System.Int32,System.Boolean>..ctor
IL_0093: stsfld UserQuery.CS$<>9__CachedAnonymousMethodDelegate4
IL_0098: br.s IL_009A
IL_009A: ldsfld UserQuery.CS$<>9__CachedAnonymousMethodDelegate4
IL_009F: call System.Linq.Enumerable.Where <------------Second Call
IL_00A4: stloc.1 // query
b__1:
IL_0000: ldarg.0
IL_0001: ldc.i4.2
IL_0002: cgt
IL_0004: stloc.0 // CS$1$0000
IL_0005: br.s IL_0007
IL_0007: ldloc.0 // CS$1$0000
IL_0008: ret
b__2:
IL_0000: ldarg.0
IL_0001: ldc.i4.5
IL_0002: clt
IL_0004: stloc.0 // CS$1$0000
IL_0005: br.s IL_0007
IL_0007: ldloc.0 // CS$1$0000
IL_0008: ret
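To make the two forms being compared concrete, here is a minimal sketch of the same filter written both ways. The class and method names (WhereEquivalence, Chained, Combined) are made up for illustration and are not part of the code being benchmarked; the point is only that both forms select the same elements, so any difference between them is one of cost, not of results.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WhereEquivalence
{
    // Two chained Where calls: the second call receives the sequence
    // returned by the first call as its input.
    public static List<int> Chained(IEnumerable<int> source)
    {
        return source.Where(x => x > 2).Where(x => x < 5).ToList();
    }

    // One Where call with both conditions joined by && in a single lambda.
    public static List<int> Combined(IEnumerable<int> source)
    {
        return source.Where(x => x > 2 && x < 5).ToList();
    }

    public static void Main()
    {
        var numbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        Console.WriteLine(string.Join(",", Chained(numbers)));  // prints 3,4
        Console.WriteLine(string.Join(",", Combined(numbers))); // prints 3,4
    }
}
```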
Then I ran a simple benchmark on Windows 7 with .NET 3.5 and 4.0:
static void Main(string[] args)
{
int size = 10000000;
Console.WriteLine("chain clauses");
RunTests(size,true);
Console.WriteLine("use and");
RunTests(size,false);
}
static void RunTests(int size, bool chainClauses)
{
for (int i = 1; i <= 10; i++)
{
if (chainClauses)
RunTestChaining(i, size);
else
RunTestAnd(i, size);
}
}
static void RunTestChaining(int depth, int size)
{
IEnumerable<string> input = Enumerable.Repeat("value", size);
switch (depth)
{
case 1:
input = input.Where(x => !x.Equals("1"));
break;
case 2:
input = input.Where(x => !x.Equals("1")).Where(x => !x.Equals("2"));
break;
case 3:
input = input.Where(x => !x.Equals("1")).Where(x => !x.Equals("2")).Where(x => !x.Equals("3"));
break;
case 4:
input = input.Where(x => !x.Equals("1")).Where(x => !x.Equals("2")).Where(x => !x.Equals("3")).Where(x => !x.Equals("4"));
break;
case 5:
input = input.Where(x => !x.Equals("1")).Where(x => !x.Equals("2")).Where(x => !x.Equals("3")).Where(x => !x.Equals("4")).Where(x => !x.Equals("5"));
break;
case 6:
input = input.Where(x => !x.Equals("1")).Where(x => !x.Equals("2")).Where(x => !x.Equals("3")).Where(x => !x.Equals("4")).Where(x => !x.Equals("5")).Where(x => !x.Equals("6"));
break;
case 7:
input = input.Where(x => !x.Equals("1")).Where(x => !x.Equals("2")).Where(x => !x.Equals("3")).Where(x => !x.Equals("4")).Where(x => !x.Equals("5")).Where(x => !x.Equals("6")).Where(x => !x.Equals("7"));
break;
case 8:
input = input.Where(x => !x.Equals("1")).Where(x => !x.Equals("2")).Where(x => !x.Equals("3")).Where(x => !x.Equals("4")).Where(x => !x.Equals("5")).Where(x => !x.Equals("6")).Where(x => !x.Equals("7")).Where(x => !x.Equals("8"));
break;
case 9:
input = input.Where(x => !x.Equals("1")).Where(x => !x.Equals("2")).Where(x => !x.Equals("3")).Where(x => !x.Equals("4")).Where(x => !x.Equals("5")).Where(x => !x.Equals("6")).Where(x => !x.Equals("7")).Where(x => !x.Equals("8")).Where(x => !x.Equals("9"));
break;
case 10:
input = input.Where(x => !x.Equals("1")).Where(x => !x.Equals("2")).Where(x => !x.Equals("3")).Where(x => !x.Equals("4")).Where(x => !x.Equals("5")).Where(x => !x.Equals("6")).Where(x => !x.Equals("7")).Where(x => !x.Equals("8")).Where(x => !x.Equals("9")).Where(x => !x.Equals("10"));
break;
}
Stopwatch sw = Stopwatch.StartNew();
var count = input.Count();
sw.Stop();
Console.WriteLine("Depth: {0} Count: {1} Time: {2}ms",
depth, count, sw.ElapsedMilliseconds);
}
static void RunTestAnd(int depth, int size )
{
IEnumerable<string> input = Enumerable.Repeat("value", size);
Func<string, bool> predicate = x => true;
switch (depth)
{
case 1:
predicate = x => !x.Equals("1");
break;
case 2:
predicate = x => !x.Equals("1") && !x.Equals("2");
break;
case 3:
predicate = x => !x.Equals("1") && !x.Equals("2") && !x.Equals("3");
break;
case 4:
predicate = x => !x.Equals("1") && !x.Equals("2") && !x.Equals("3")&&!x.Equals("3");
break;
case 5:
predicate = x => !x.Equals("1") && !x.Equals("2") && !x.Equals("3")&&!x.Equals("3")&& !x.Equals("5");
break;
case 6:
predicate = x => !x.Equals("1") && !x.Equals("2") && !x.Equals("3")&&!x.Equals("3")&& !x.Equals("5") && !x.Equals("6");
break;
case 7:
predicate = x => !x.Equals("1") && !x.Equals("2") && !x.Equals("3")&&!x.Equals("3")&& !x.Equals("5") && !x.Equals("6") && !x.Equals("7");
break;
case 8:
predicate = x => !x.Equals("1") && !x.Equals("2") && !x.Equals("3")&&!x.Equals("3")&& !x.Equals("5") && !x.Equals("6") && !x.Equals("7") && !x.Equals("8");
break;
case 9:
predicate = x => !x.Equals("1") && !x.Equals("2") && !x.Equals("3")&&!x.Equals("3")&& !x.Equals("5") && !x.Equals("6") && !x.Equals("7") && !x.Equals("8") && !x.Equals("9");
break;
case 10:
predicate = x => !x.Equals("1") && !x.Equals("2") && !x.Equals("3")&&!x.Equals("3")&& !x.Equals("5") && !x.Equals("6") && !x.Equals("7") && !x.Equals("8") && !x.Equals("9") && !x.Equals("10");
break;
}
input = input.Where(predicate);
Stopwatch sw = Stopwatch.StartNew();
var count = input.Count();
sw.Stop();
Console.WriteLine("Depth: {0} Count: {1} Time: {2}ms",
depth, count, sw.ElapsedMilliseconds);
}
And the results:
// .Net 3.5 //.Net 4.0
chain clauses chain clauses
Depth: 1 Count: 10000000 Time: 181ms Depth: 1 Count: 10000000 Time: 216ms
Depth: 2 Count: 10000000 Time: 248ms Depth: 2 Count: 10000000 Time: 278ms
Depth: 3 Count: 10000000 Time: 315ms Depth: 3 Count: 10000000 Time: 347ms
Depth: 4 Count: 10000000 Time: 378ms Depth: 4 Count: 10000000 Time: 437ms
Depth: 5 Count: 10000000 Time: 443ms Depth: 5 Count: 10000000 Time: 509ms
Depth: 6 Count: 10000000 Time: 514ms Depth: 6 Count: 10000000 Time: 573ms
Depth: 7 Count: 10000000 Time: 579ms Depth: 7 Count: 10000000 Time: 649ms
Depth: 8 Count: 10000000 Time: 644ms Depth: 8 Count: 10000000 Time: 727ms
Depth: 9 Count: 10000000 Time: 978ms Depth: 9 Count: 10000000 Time: 1278ms
Depth: 10 Count: 10000000 Time: 1546ms Depth: 10 Count: 10000000 Time: 1075ms
use and use and
Depth: 1 Count: 10000000 Time: 181ms Depth: 1 Count: 10000000 Time: 202ms
Depth: 2 Count: 10000000 Time: 200ms Depth: 2 Count: 10000000 Time: 234ms
Depth: 3 Count: 10000000 Time: 228ms Depth: 3 Count: 10000000 Time: 267ms
Depth: 4 Count: 10000000 Time: 245ms Depth: 4 Count: 10000000 Time: 303ms
Depth: 5 Count: 10000000 Time: 267ms Depth: 5 Count: 10000000 Time: 335ms
Depth: 6 Count: 10000000 Time: 289ms Depth: 6 Count: 10000000 Time: 364ms
Depth: 7 Count: 10000000 Time: 312ms Depth: 7 Count: 10000000 Time: 397ms
Depth: 8 Count: 10000000 Time: 326ms Depth: 8 Count: 10000000 Time: 432ms
Depth: 9 Count: 10000000 Time: 366ms Depth: 9 Count: 10000000 Time: 462ms
Depth: 10 Count: 10000000 Time: 375ms Depth: 10 Count: 10000000 Time: 493ms
According to these results there is a significant performance deterioration, which suggests that you should avoid chaining Where clauses in LINQ to Objects.
Or is there something I am missing?
Filter collections using Where clause in C#. A single query expression may have multiple where clauses.
The term "LINQ to Objects" refers to the use of LINQ queries with any IEnumerable or IEnumerable<T> collection directly, without the use of an intermediate LINQ provider or API such as LINQ to SQL or LINQ to XML. You can use LINQ to query any enumerable collections such as List<T>, Array, or Dictionary<TKey,TValue>.
In a nutshell, LINQ to Objects provides the developer with the means to conduct queries against an in-memory collection of objects. The techniques used to query against such collections of objects are similar to but simpler than the approaches used to conduct queries against a relational database using SQL statements.
Yes, it would be good advice to try to avoid chaining .Where() clauses on top of one another when it's a no-brainer and you have the opportunity, like in this microbenchmark.
Note that .NET's LINQ-to-Objects implementation is smart enough to combine the predicates for you when you do this. It's not as bad as it could have been, but running the fancy chained delegate is not going to be as fast or as elegant as running a single delegate with all the logic inside of it.
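As a rough illustration of why the combined form still costs more than a single hand-written lambda, here is a sketch of what "combining the predicates" can look like. The Combine helper below is hypothetical and is not the framework's actual code; it only assumes that the two predicates end up wrapped in one outer delegate.

```csharp
using System;
using System.Linq;

public static class PredicateCombining
{
    // Hypothetical helper: wraps two predicates in a single delegate.
    // Calling the result invokes the outer delegate, then 'first',
    // and then 'second' if 'first' returned true.
    public static Func<T, bool> Combine<T>(Func<T, bool> first, Func<T, bool> second)
    {
        return x => first(x) && second(x);
    }

    public static void Main()
    {
        Func<string, bool> notOne = x => !x.Equals("1");
        Func<string, bool> notTwo = x => !x.Equals("2");

        // One pass over the source, but up to three delegate invocations per element.
        Func<string, bool> combined = Combine(notOne, notTwo);

        // One pass and one delegate invocation per element.
        Func<string, bool> single = x => !x.Equals("1") && !x.Equals("2");

        var input = Enumerable.Repeat("value", 1000);
        Console.WriteLine(input.Where(combined).Count()); // prints 1000
        Console.WriteLine(input.Where(single).Count());   // prints 1000
    }
}
```

Each extra level of chaining adds another layer of wrapping like this, which is consistent with the chained timings growing faster than the && timings in the benchmark.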
However, suppose you have an arbitrary instance of an IEnumerable<T>, which may or may not be the result of some .Where() method already, and you're writing a method that needs to filter it on some predicate.
Based on the results of this microbenchmark, are you really going to refactor your entire code base so that you can "avoid chaining where clause in Linq to objects", or are you just going to add one more .Where() and move on with your life?
As always, if you run actual performance testing on an application that has performance problems (meaning, performance is outside of the range defined as "acceptable"), and the results indicate that chaining .Where() clauses is a bottleneck, then you might be more justified to try and rethink what's going on.
Also, out of curiosity, I changed the duplicated "3" clauses to "4" clauses in RunTestAnd and ran your code on my 4-core Windows 8.1 x64 machine in .NET 4.5.1 (release mode, no debugger). I then ran it again after changing the line that initializes input to:
var input = Enumerable.Repeat("value", size).AsParallel();
with these results (output massaged for prettiness, but I vouch that these were the actual numbers):
chain clauses (parallel) chain clauses (serial)
Depth: 1 Count: 10000000 Time: 284ms Depth: 1 Count: 10000000 Time: 185ms
Depth: 2 Count: 10000000 Time: 241ms Depth: 2 Count: 10000000 Time: 248ms
Depth: 3 Count: 10000000 Time: 267ms Depth: 3 Count: 10000000 Time: 308ms
Depth: 4 Count: 10000000 Time: 256ms Depth: 4 Count: 10000000 Time: 370ms
Depth: 5 Count: 10000000 Time: 371ms Depth: 5 Count: 10000000 Time: 432ms
Depth: 6 Count: 10000000 Time: 345ms Depth: 6 Count: 10000000 Time: 667ms
Depth: 7 Count: 10000000 Time: 342ms Depth: 7 Count: 10000000 Time: 569ms
Depth: 8 Count: 10000000 Time: 465ms Depth: 8 Count: 10000000 Time: 627ms
Depth: 9 Count: 10000000 Time: 434ms Depth: 9 Count: 10000000 Time: 862ms
Depth: 10 Count: 10000000 Time: 416ms Depth: 10 Count: 10000000 Time: 1235ms
use and (parallel) use and (serial)
Depth: 1 Count: 10000000 Time: 263ms Depth: 1 Count: 10000000 Time: 182ms
Depth: 2 Count: 10000000 Time: 265ms Depth: 2 Count: 10000000 Time: 217ms
Depth: 3 Count: 10000000 Time: 239ms Depth: 3 Count: 10000000 Time: 228ms
Depth: 4 Count: 10000000 Time: 255ms Depth: 4 Count: 10000000 Time: 255ms
Depth: 5 Count: 10000000 Time: 272ms Depth: 5 Count: 10000000 Time: 275ms
Depth: 6 Count: 10000000 Time: 255ms Depth: 6 Count: 10000000 Time: 295ms
Depth: 7 Count: 10000000 Time: 268ms Depth: 7 Count: 10000000 Time: 320ms
Depth: 8 Count: 10000000 Time: 268ms Depth: 8 Count: 10000000 Time: 339ms
Depth: 9 Count: 10000000 Time: 305ms Depth: 9 Count: 10000000 Time: 363ms
Depth: 10 Count: 10000000 Time: 267ms Depth: 10 Count: 10000000 Time: 386ms
What this suggests to me is that if you do find that your 10-deep chain of .Where() clauses is a bottleneck, there's no easy straightforward refactoring, and your specific usage scenario allows it, consider trying PLINQ.
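For reference, here is a minimal sketch of what switching a chained query to PLINQ looks like. The class and method names are made up for illustration, and the chain is shortened to two clauses. Whether the parallel version is actually faster depends on the workload and the machine, so it should be measured rather than assumed.

```csharp
using System;
using System.Linq;

public static class PlinqSketch
{
    // Serial version: ordinary LINQ to Objects with chained Where clauses.
    public static int CountSerial(int size)
    {
        return Enumerable.Repeat("value", size)
            .Where(x => !x.Equals("1"))
            .Where(x => !x.Equals("2"))
            .Count();
    }

    // Parallel version: AsParallel() makes the later Where and Count calls
    // bind to the PLINQ operators, which may split the work across cores.
    public static int CountParallel(int size)
    {
        return Enumerable.Repeat("value", size)
            .AsParallel()
            .Where(x => !x.Equals("1"))
            .Where(x => !x.Equals("2"))
            .Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountSerial(1000000));   // prints 1000000
        Console.WriteLine(CountParallel(1000000)); // prints 1000000
    }
}
```

Note that PLINQ does not preserve source order unless you ask for it, which does not matter for Count() but may matter for other queries.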