Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Ambiguity in parameter type inference for C# lambda expressions

My question is motivated by Eric Lippert's this blog post. Consider the following code:

using System;
class Program {
    class A {}
    class B {}
    static void M(A x, B y) { Console.WriteLine("M(A, B)"); }
    static void Call(Action<A> f) { f(new A()); }
    static void Call(Action<B> f) { f(new B()); }
    static void Main() { Call(x => Call(y => M(x, y))); }
}

This compiles successfully and prints M(A, B), because the compiler figures out that the types of x and y in the lambda expressions should be A and B respectively. Now, add an overload for Program.M:

using System;
class Program {
    class A {}
    class B {}
    static void M(A x, B y) { Console.WriteLine("M(A, B)"); }
    static void M(B x, A y) { Console.WriteLine("M(B, A)"); } // added line
    static void Call(Action<A> f) { f(new A()); }
    static void Call(Action<B> f) { f(new B()); }
    static void Main() { Call(x => Call(y => M(x, y))); }
}

This yields a compile-time error:

error CS0121: The call is ambiguous between the following methods or properties: 'Program.Call(Action<Program.A>)' and 'Program.Call(Action<Program.B>)'

The compiler cannot infer the types of x and y. It may be that x is of type A and y is of type B or vice versa, and neither can be preferred because of full symmetry. So far so good. Now, add one more overload for Program.M:

using System;
class Program {
    class A {}
    class B {}
    static void M(A x, B y) { Console.WriteLine("M(A, B)"); }
    static void M(B x, A y) { Console.WriteLine("M(B, A)"); }
    static void M(B x, B y) { Console.WriteLine("M(B, B)"); } // added line
    static void Call(Action<A> f) { f(new A()); }
    static void Call(Action<B> f) { f(new B()); }
    static void Main() { Call(x => Call(y => M(x, y))); }
}

This compiles successfully and prints M(A, B) again! I can guess the reason. The compiler resolves the overload of Program.Call trying to compile the lambda expression x => Call(y => M(x, y)) for x of type A and for x of type B. The former succeeds, while the latter fails because of ambiguity detected when trying to infer the type of y. Therefore, the compiler concludes that x must be of type A.

Thus adding more ambiguity results in less ambiguity. This is weird. Moreover, this is inconsistent with what Eric wrote in the above-mentioned post:

If it has more than one solution then compilation fails with an ambiguity error.

Is there any good reason for the current behavior? Is it just the matter of making the compiler's life easier? Or is it rather a flaw in the compiler/specification?

like image 590
Bartosz Avatar asked Oct 25 '16 21:10

Bartosz


1 Answers

Interesting scenarios. Let's consider how the compiler analyzes each.

In your first scenario the only possibility is that x is A and y is B. Everything else produces an error.

In your second scenario we could have x is A, y is B, or x is B, y is A. Either solution works, we have no basis upon which to prefer one, so the program is ambiguous.

Now consider your third scenario. Let's start with the supposition that x is B. If x is B then y could be A or B. We have no reason to prefer A or B for y. Therefore a program in which x is B is ambiguous. Therefore x cannot be B; our supposition must have been wrong.

So either x is A, or the program is erroneous. Can x be A? If it is, then y must be B. We deduce no error if x is A, and we deduce an error if x is B, so x must be A.

From that we can deduce that x is A and y is B.

This is weird.

Yep. Overload resolution is hard enough in a world without generic type inference and lambdas. With them it is really quite difficult.

I suspect your difficulty lies in what seems to be a better analysis for the third scenario:

  • x is A, y is A fails
  • x is A, y is B works
  • x is B, y is A works
  • x is B, y is B works
  • therefore there are three solutions, none is better, therefore this is ambiguous.

That's not how it works. Rather, we give every possible type assignment to the outermost lambda and try to deduce success or failure for each.

If you make the argument that it's a bit weird that "order matters" -- the lambda on the outside is in some sense "privileged" over the lambdas on the inside, well, sure, I can see that argument. That's the nature of backtracking algorithms.

If it has more than one solution then compilation fails with an ambiguity error.

That's still true. In your first and third scenarios there is one solution which can be deduced without contradiction; in the second one there are two solutions and that's ambiguous.

Is there any good reason for the current behavior?

Yes. We considered these rules extremely carefully. Like all design decisions, there was a process of compromise.

Is it just the matter of making the compiler's life easier?

HA HA HA HA HA HA HA HA HA.

It took me the better part of a year to design, specify, implement and test all this logic, and much time and effort from many of my colleagues. Easy doesn't come into any part of it.

Or is it rather a flaw in the compiler/specification?

No.

The goal of the specification process was to come up with a design that produced reasonable inferences given the sorts of overloads we saw in the LINQ standard libraries. I think we achieved that goal. "Adding an overload never causes an ambiguous program to become non-ambiguous" was not at any time a goal of the specification process.

like image 60
Eric Lippert Avatar answered Oct 20 '22 19:10

Eric Lippert