
"Closure over variable gives slightly worse performance". How?

Tags: c#

While answering an SO question, I was told that my solution introduces a closure over a variable and will therefore have slightly worse performance. So my question is:

  1. How will there be a closure?
  2. How will it affect performance?

Here is the code from the question:

List.Where(s => s.ValidDate.Date == DateTime.Today.Year).ToList();

Here is my solution. I introduced the variable yr to store the year.

int yr = DateTime.Now.Year;
List.Where(s => s.ValidDate.Year == yr).ToList();

The remark was made in the comments on that answer.

Nikhil Agrawal asked Jul 12 '14 08:07


3 Answers

First of all, the two snippets are not functionally equivalent (once you fix the comparison of a date with an int, .Date == .Today.Year):

  • The first snippet re-evaluates DateTime.Today.Year for each element of the list, which can give different results if the current year changes during iteration.

  • The second snippet stores the current year and re-uses that, so all items in the resulting list will have the same year. (I'd personally take this approach, as I want to make sure the result is sane).
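
To make the difference in evaluation count concrete, here is a minimal sketch. It assumes a hypothetical helper, CurrentYear(), that wraps DateTime.Today.Year and counts its own calls; neither the helper nor the class name comes from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class EvaluationCountDemo
{
    // Hypothetical stand-in for DateTime.Today.Year that counts how often it is called.
    public static int Calls;

    public static int CurrentYear()
    {
        Calls++;
        return DateTime.Today.Year;
    }

    // Year is looked up inside the lambda: one lookup per list element.
    public static int CountCallsInline(List<DateTime> dates)
    {
        Calls = 0;
        var result = dates.Where(d => d.Year == CurrentYear()).ToList();
        return Calls;
    }

    // Year is looked up once, stored in a local, and captured by the lambda.
    public static int CountCallsHoisted(List<DateTime> dates)
    {
        Calls = 0;
        int yr = CurrentYear();
        var result = dates.Where(d => d.Year == yr).ToList();
        return Calls;
    }

    static void Main()
    {
        var dates = Enumerable.Repeat(DateTime.Today, 5).ToList();
        Console.WriteLine(CountCallsInline(dates));   // one call per element
        Console.WriteLine(CountCallsHoisted(dates));  // a single call
    }
}
```

With five elements, the inline version performs five lookups and the hoisted version performs one, which is also why only the hoisted version is guaranteed to compare every element against the same year.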

The closure is introduced because the lambda accesses a variable from its outer scope: it closes over yr. The C# compiler generates a new class with a field that holds yr. All references to yr are replaced with accesses to that field, and the original local yr does not even exist in the compiled code.
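
As a rough illustration of that transformation, the sketch below writes the closure class by hand. The names YearClosure and Matches are hypothetical; the class the compiler actually emits has a generated name that cannot be written in source code, and its exact shape may differ between compiler versions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class SomeData
{
    public DateTime ValidDate { get; set; }
}

// Hand-written stand-in for the compiler-generated closure class (hypothetical name).
sealed class YearClosure
{
    public int yr;  // the captured local lives here as a field

    // The body of the lambda becomes an instance method on the closure class.
    public bool Matches(SomeData s)
    {
        return s.ValidDate.Year == yr;
    }
}

static class ClosureLoweringDemo
{
    // What the programmer writes.
    public static List<SomeData> FilterWithLambda(List<SomeData> list, int year)
    {
        int yr = year;
        return list.Where(s => s.ValidDate.Year == yr).ToList();
    }

    // Approximately what it turns into: one closure object, one delegate.
    public static List<SomeData> FilterByHand(List<SomeData> list, int year)
    {
        var closure = new YearClosure();
        closure.yr = year;
        return list.Where(new Func<SomeData, bool>(closure.Matches)).ToList();
    }

    static void Main()
    {
        var list = new List<SomeData>
        {
            new SomeData { ValidDate = new DateTime(2013, 5, 1) },
            new SomeData { ValidDate = new DateTime(2014, 7, 12) },
            new SomeData { ValidDate = new DateTime(2014, 1, 1) },
        };
        Console.WriteLine(FilterWithLambda(list, 2014).Count);  // 2
        Console.WriteLine(FilterByHand(list, 2014).Count);      // 2
    }
}
```

The cost of the closure in this picture is one small object allocation plus a field read per element, instead of a read of a local variable.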

I doubt there will be a performance penalty from introducing a closure. If anything, the code using the closure will be faster, since it does not have to create a new DateTime instance for every list item and then read two properties. It only has to access the field of the compiler-generated closure class, which holds the int value of the current year. (Does anybody want to compare the generated IL or profile the two snippets? :))

knittl answered Nov 15 '22 18:11


In addition to knittl's answer, I wanted to try to measure the performance with and without a closure. Here is what my test looks like:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

internal class SomeData {
    public DateTime ValidDate { get; set; }
    // other data ...
}

class Program {
    static void Main(string[] args) {
        var stopWatch = new Stopwatch();

        // Test with closure
        IEnumerable<SomeData> data1 = CreateTestData(100000);
        stopWatch.Start();
        int yr = DateTime.Now.Year;
        List<SomeData> results1 = data1.Where(x => x.ValidDate.Year == yr).ToList();
        stopWatch.Stop();
        // ElapsedMilliseconds is the total elapsed time; Elapsed.Milliseconds is only the 0-999 component
        Console.WriteLine("With a closure - {0} ms", stopWatch.ElapsedMilliseconds);
        // ### Output on my machine (consistently): With a closure - 16 ms

        stopWatch.Reset();

        // Test without a closure            
        IEnumerable<SomeData> data2 = CreateTestData(100000);
        stopWatch.Start();
        List<SomeData> results2 = data2.Where(x => x.ValidDate.Year == DateTime.Today.Year).ToList();
        stopWatch.Stop();
        // ElapsedMilliseconds is the total elapsed time; Elapsed.Milliseconds is only the 0-999 component
        Console.WriteLine("Without a closure - {0} ms", stopWatch.ElapsedMilliseconds);
        // ### Output on my machine: Without a closure - 33 ms
    }

    private static IEnumerable<SomeData> CreateTestData(int numberOfItems) {
        var dt = DateTime.Today;
        for (int i = 0; i < numberOfItems; i++) {
            yield return new SomeData {ValidDate = dt};
        }
    }
}

Bottom line from my tests: as I expected, the version with the closure is considerably faster.

Dimitar Dimitrov answered Nov 15 '22 16:11


Here's a naive time measurement, merely to complement knittl's answer.

The result is that the version that evaluates DateTime.Now every time is more than 10 times slower than your code.

Results on my machine: T1: 8878 ms; T2: 589 ms. (Maximum optimization, no debugger, etc).

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        var things = new List<Something>();
        var random = new Random(111);
        for (int i = 0; i < 100000; ++i)
        {
            things.Add(new Something(random.Next(2010, 2016)));
        }

        // to avoid measuring the JIT compilation and optimization time
        T1(things);
        T2(things);

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < 100; ++i)
        {
            T1(things);
        }
        Console.WriteLine(sw.ElapsedMilliseconds);
        sw.Restart();
        for (int i = 0; i < 100; ++i)
        {
            T2(things);
        }
        Console.WriteLine(sw.ElapsedMilliseconds);

        Console.ReadLine();
    }

    private static void T1(List<Something> list)
    {
        var result = list.Where(x => x.ValidDate.Year == DateTime.Now.Year).ToList();
    }

    private static void T2(List<Something> list)
    {
        var yr = DateTime.Now.Year;
        var result = list.Where(x => x.ValidDate.Year == yr).ToList();
    }
}

class Something
{
    public Something(int year)
    {
        this.ValidDate = new DateTime(year, 1, 1);
    }

    public DateTime ValidDate { get; private set; }
}

dialer answered Nov 15 '22 18:11