
Why are lambda expressions not "interned"?

Strings are reference types, but they are immutable. This allows them to be interned by the compiler; everywhere the same string literal appears, the same object may be referenced.

Delegates are also immutable reference types. (Adding a method to a multicast delegate using the += operator constitutes assignment; that's not mutability.) And, like strings, there is a "literal" way to represent a delegate in code, using a lambda expression, e.g.:

Func<int> func = () => 5;

The right-hand side of that statement is an expression whose type is Func<int>; but nowhere am I explicitly invoking the Func<int> constructor (nor is an implicit conversion happening). So I view this as essentially a literal. Am I mistaken about my definition of "literal" here?

Regardless, here's my question. If I have two variables for, say, the Func<int> type, and I assign identical lambda expressions to both:

Func<int> x = () => 5;
Func<int> y = () => 5;

...what's preventing the compiler from treating these as the same Func<int> object?
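One way to see what a particular compiler actually does with two such assignments is a small probe like the sketch below. The class name LambdaIdentityProbe and its members are made up for illustration. No expected output is given for the identity check, because the result depends on the compiler.

```csharp
using System;

// Hypothetical probe class; the names are illustrative only.
public static class LambdaIdentityProbe
{
    // Two textually identical lambdas converted to the same delegate type.
    public static readonly Func<int> X = () => 5;
    public static readonly Func<int> Y = () => 5;

    // True only if the compiler handed back one shared delegate instance.
    public static bool SameInstance()
    {
        return object.ReferenceEquals(X, Y);
    }

    public static void Main()
    {
        // Both delegates compute the same value regardless of identity.
        Console.WriteLine(X() == Y());

        // Compiler-dependent: either result is legal.
        Console.WriteLine(SameInstance());
    }
}
```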

I ask because section 6.5.1 of the C# 4.0 language specification clearly states:

Conversions of semantically identical anonymous functions with the same (possibly empty) set of captured outer variable instances to the same delegate types are permitted (but not required) to return the same delegate instance. The term semantically identical is used here to mean that execution of the anonymous functions will, in all cases, produce the same effects given the same arguments.

This surprised me when I read it; if this behavior is explicitly allowed, I would have expected it to be implemented. But it appears not to be. This has in fact gotten a lot of developers into trouble, especially when lambda expressions are used to attach event handlers that then cannot be removed. For example:

using System;

class EventSender
{
    public event EventHandler Event;
    public void Send()
    {
        EventHandler handler = this.Event;
        if (handler != null) { handler(this, EventArgs.Empty); }
    }
}

class Program
{
    static string _message = "Hello, world!";

    static void Main()
    {
        var sender = new EventSender();
        sender.Event += (obj, args) => Console.WriteLine(_message);
        sender.Send();

        // Unless I'm mistaken, this lambda expression is semantically identical
        // to the one above. However, the handler is not removed, indicating
        // that a different delegate instance is constructed.
        sender.Event -= (obj, args) => Console.WriteLine(_message);

        // This prints "Hello, world!" again.
        sender.Send();
    }
}

Is there any reason why this behavior—one delegate instance for semantically identical anonymous methods—is not implemented?

Dan Tao asked Jan 26 '11 17:01


1 Answer

You're mistaken to call it a literal, IMO. It's just an expression which is convertible to a delegate type.

Now as for the "interning" part - some lambda expressions are cached, in that for one single lambda expression, a single delegate instance can sometimes be created and reused however often that line of code is encountered. Others are not treated that way: it usually depends on whether the lambda expression captures any non-static variables (whether that's via "this" or local to the method).

Here's an example of this caching:

using System;

class Program
{
    static void Main()
    {
        Action first = GetFirstAction();
        first -= GetFirstAction();
        Console.WriteLine(first == null); // Prints True

        Action second = GetSecondAction();
        second -= GetSecondAction();
        Console.WriteLine(second == null); // Prints False
    }

    static Action GetFirstAction()
    {
        return () => Console.WriteLine("First");
    }

    static Action GetSecondAction()
    {
        int i = 0;
        return () => Console.WriteLine("Second " + i);
    }
}

In this case we can see that the first action was cached (or at least, two equal delegates were produced, and in fact Reflector shows that it really is cached in a static field). The second action created two unequal instances of Action for the two calls to GetSecondAction, which is why "second" is non-null at the end.
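The "two equal delegates" caveat matters because -= removes a handler that is equal to the operand, not necessarily the same object. Delegate equality compares the target and the method. The sketch below shows that with a plain static method instead of a lambda; DelegateEqualityDemo and Greet are hypothetical names used only for this illustration.

```csharp
using System;

// Hypothetical demo class; the names are illustrative only.
public static class DelegateEqualityDemo
{
    static void Greet() { Console.WriteLine("hi"); }

    // Two delegates created separately from the same static method.
    public static bool MethodGroupsAreEqual()
    {
        Action a = Greet;
        Action b = Greet;
        return a == b; // same method, no target, so they compare equal
    }

    // Removal works with an equal delegate; it need not be the same object.
    public static bool RemovalLeavesNull()
    {
        Action handlers = Greet;
        handlers -= Greet;
        return handlers == null;
    }

    public static void Main()
    {
        Console.WriteLine(MethodGroupsAreEqual()); // Prints True
        Console.WriteLine(RemovalLeavesNull());    // Prints True
    }
}
```

Two lambdas written in different places compile to different generated methods, so this kind of equality does not help there unless the compiler chooses to unify them.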

Interning lambdas which appear in different places in the code but with the same source code is a different matter. I suspect it would be quite complex to do this properly (after all, the same source code can mean different things in different places) and I would certainly not want to rely on it taking place. If it's not going to be worth relying on, and it's a lot of work to get right for the compiler team, I don't think it's the best way they could be spending their time.
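The usual way to avoid depending on any such interning is to keep a reference to the delegate and use that same reference for both += and -=. The following is a minimal sketch based on the EventSender class from the question; UnsubscribeDemo and the call counter are added here purely for illustration.

```csharp
using System;

public class EventSender
{
    public event EventHandler Event;

    public void Send()
    {
        EventHandler handler = Event;
        if (handler != null) { handler(this, EventArgs.Empty); }
    }
}

// Hypothetical demo class; the names are illustrative only.
public static class UnsubscribeDemo
{
    public static int CountCallsAfterUnsubscribe()
    {
        int calls = 0;
        var sender = new EventSender();

        // Store the delegate once so the very same instance can be removed later.
        EventHandler handler = (obj, args) => calls++;

        sender.Event += handler;
        sender.Send();            // handler runs once

        sender.Event -= handler;  // removes the stored instance
        sender.Send();            // no subscribers left, nothing runs

        return calls;
    }

    public static void Main()
    {
        Console.WriteLine(CountCallsAfterUnsubscribe()); // Prints 1
    }
}
```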

Jon Skeet answered Oct 09 '22 22:10
