
foreach loop List performance difference

While working on a project I ran into the following piece of code, which raised a performance flag.

foreach (var sample in List.Where(x => !x.Value.Equals("Not Reviewed")))
{
    //do other work here
    count++;
}

I decided to run a couple of quick tests comparing the original loop to the following loop:

foreach (var sample in List)
{
    if (!sample.Value.Equals("Not Reviewed"))
    {
        //do other work here
        count++;
    }
}

I also added this loop to see what would happen:

var tempList = List.Where(x => !x.Value.Equals("Not Reviewed"));
foreach (var sample in tempList)
{
    //do other work here
    count++;
}

I also populated the original list three different ways: 50-50 (so 50% of the values were "Not Reviewed" and the rest something else), 10-90, and 90-10. These are my results: the first and last loops perform about the same, but the second one is much faster, especially in the 10-90 case. Why exactly? I always thought lambdas had good performance.

EDIT

The count++ is not actually what's inside the loop, I just added that here for demonstration purposes, I guess I should've used "//do something here"

[Image: performance results]

EDIT 2

Results of running each one 1000 times: [Image: performance results, 1000 runs]

SOfanatic asked Jul 18 '13 15:07



1 Answer

Basically, there's a small amount of extra indirection - both for the test via a delegate, and for the iterating part. Given just how little work is being done per iteration, that extra indirection is relatively expensive.
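
A minimal sketch of where that indirection comes from, assuming the list holds KeyValuePair<int, string> items (the question does not show the element type). WhereSketch is a hypothetical, simplified stand-in for Enumerable.Where, not the real LINQ source, and LoopDemo, CountFiltered and CountInline are names invented for illustration:

```csharp
using System;
using System.Collections.Generic;

public static class LoopDemo
{
    // Hypothetical, simplified stand-in for Enumerable.Where (not the real implementation).
    public static IEnumerable<T> WhereSketch<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)      // inner iteration over the underlying list
        {
            if (predicate(item))        // one delegate invocation per element
            {
                yield return item;      // element is handed to the outer foreach through the iterator
            }
        }
    }

    // Filtered loop: every element goes through the extra iterator and the delegate.
    public static int CountFiltered(List<KeyValuePair<int, string>> list)
    {
        int count = 0;
        foreach (var sample in WhereSketch(list, x => !x.Value.Equals("Not Reviewed")))
        {
            count++;
        }
        return count;
    }

    // Plain loop: the same test written as an if, with no delegate and no second iterator.
    public static int CountInline(List<KeyValuePair<int, string>> list)
    {
        int count = 0;
        foreach (var sample in list)
        {
            if (!sample.Value.Equals("Not Reviewed"))
            {
                count++;
            }
        }
        return count;
    }
}
```

Both methods return the same count; the only difference is the path each element takes before count++ runs.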

That's neither surprising nor worrying, in my view. It's the kind of micro-optimization you can easily perform if you're in the rare situation of it being significant in your real-world application. In my experience it's pretty rare for this sort of loop to be a significant bottleneck in the app. The normal approach should be:

  • Define performance requirements
  • Implement the functional requirements in the clearest, simplest way you can
  • Measure your performance against the requirements
  • If performance is found wanting, investigate why and only move away from clarity as little as you can, getting the biggest "bang for buck" that you can
  • Repeat until you're done

Responding to an edit:

The count++ is not actually what's inside the loop, I just added that here for demonstration purposes, I guess I should've used "//do something here"

Well that's the important bit - the more work that is done there, the less significant anything else will be. Just counting is pretty darned fast, so I'd expect to see a large discrepancy. Do any amount of real work, and the difference will be smaller.
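
One way to check this on your own data is a rough timing harness like the sketch below. It assumes KeyValuePair<int, string> elements, and DoWork is a hypothetical placeholder for the real loop body, which the question does not show. Stopwatch timings like these are only indicative; they are not a substitute for a proper benchmarking setup.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class LoopTiming
{
    // Hypothetical placeholder for "real work"; workUnits controls how much is done per item.
    public static int DoWork(string value, int workUnits)
    {
        int hash = 0;
        for (int i = 0; i < workUnits; i++)
        {
            hash = unchecked(hash * 31 + value.Length + i);
        }
        return hash;
    }

    // Filtering with Where and a lambda, as in the first loop of the question.
    public static long SumWithWhere(List<KeyValuePair<int, string>> list, int workUnits)
    {
        long total = 0;
        foreach (var sample in list.Where(x => !x.Value.Equals("Not Reviewed")))
        {
            total += DoWork(sample.Value, workUnits);
        }
        return total;
    }

    // Filtering with an inline if, as in the second loop of the question.
    public static long SumWithIf(List<KeyValuePair<int, string>> list, int workUnits)
    {
        long total = 0;
        foreach (var sample in list)
        {
            if (!sample.Value.Equals("Not Reviewed"))
            {
                total += DoWork(sample.Value, workUnits);
            }
        }
        return total;
    }

    // Runs the action once as a warm-up, then times the given number of repetitions.
    public static TimeSpan Time(Func<long> action, int repetitions)
    {
        action();
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < repetitions; i++)
        {
            action();
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

For example, compare LoopTiming.Time(() => LoopTiming.SumWithWhere(list, 0), 1000) against LoopTiming.Time(() => LoopTiming.SumWithIf(list, 0), 1000), then repeat with a larger workUnits value such as 1000 and see how the relative gap changes.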

Jon Skeet answered Oct 06 '22 06:10
