Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Type inference with class implementing several interfaces of a hierarchy

Tags:

As an example, let's use something like a calculator with elements of various types, functions that evaluate for different element types, and a context to store elements and run functions. The interfaces are something like this:

public interface IElement { } public interface IChildElement : IElement {     double Score { get; } } public interface IGrandchildElement : IChildElement {     int Rank { get; } }  public interface IFunction<Tout, in Tin> where Tin : IElement {     Tout Evaluate(Tin x, Tin y); }  public interface IContext<Tin> where Tin : IElement {     Tout Evaluate<Tout>(string x, string y, IFunction<Tout, Tin> eval); } 

Note that functions may return arbitrary types. A dummy implementation is as follows, where I have a function called Foo that can be used for both IChildElement and IGrandchildElement, and returns double in both cases:

public class ChildElement : IChildElement {     public double Score { get; internal set; } } public class GrandchildElement : ChildElement, IGrandchildElement {     public int Rank { get; internal set; } }  public class Foo : IFunction<double, IChildElement>, IFunction<double, IGrandchildElement> {     public double Evaluate(IChildElement x, IChildElement y) {         return x.Score / y.Score;     }     public double Evaluate(IGrandchildElement x, IGrandchildElement y) {         return x.Score * x.Rank / y.Score / y.Rank;     } }  public class Context<T> : IContext<T> where T : IElement {     protected Dictionary<string, T> Results { get; set; }      public Context() {         this.Results = new Dictionary<string, T>();     }      public void AddElement(string key, T e) {         this.Results[key] = e;     }     public Tout Evaluate<Tout>(string x, string y, IFunction<Tout, T> eval) {         return eval.Evaluate(this.Results[x], this.Results[y]);     } } 

Some sample execution:

Context<IChildElement> cont = new Context<IChildElement>(); cont.AddElement("x", new ChildElement() { Score = 1.0 }); cont.AddElement("y", new ChildElement() { Score = 2.0 }); Foo f = new Foo(); double res1 = cont.Evaluate("x", "y", f); // This does not compile double res2 = cont.Evaluate<double>("x", "y", f); // This does 

As you can see, my problem is that I seemingly need to hard-type the call to Context.Evaluate. If I don't, the compiler says it cannot infer the type of the arguments. This is particularly striking to me since in both cases the Foo function returns double.

If Foo implements only IFunction<double, IChildElement> or IFunction<double, IGrandchildElement> I don't have this problem. But it does.

I don't understand it. I mean, adding the <double> does not differentiate between IFunction<double, IGrandchildElement> and IFunction<double, IChildElement> because they both return double. For what I understand, it doesn't provide the compiler with any additional information to distinguish.

In any case, is there any way I can avoid having to hard-type all calls to Task.Evaluate? In the real world I have several functions, so being able to avoid it would be great.

Bounty for sound explanation of why adding the <double> helps the compiler. Is this a problem with the compiler being too lazy so to speak?

Old update: using delegates

An option could be to use delegates instead of IFunctions in IContext.Evaluate:

public interface IContext<Tin> where Tin : IElement {     Tout Evaluate<Tout>(string x, string y, Func<Tin, Tin, Tout> eval); } public class Context<T> : IContext<T> where T : IElement {     // ...     public Tout Evaluate<Tout>(string x, string y, Func<T, T, Tout> eval) {         return eval(this.Results[x], this.Results[y]);     } } 

Doing so, we don't need to hard-type <double> when calling IContext.Evaluate:

Foo f = new Foo(); double res1 = cont.Evaluate("x", "y", f.Evaluate); // This does compile now double res2 = cont.Evaluate<double>("x", "y", f.Evaluate); // This still compiles 

So here the compiler does work as expected. We avoid the need to hard-type, but I don't like the fact that we use IFunction.Evaluate instead of the IFunction object itself.

like image 489
Julián Urbano Avatar asked Mar 29 '13 04:03

Julián Urbano


2 Answers

(I haven't gone through the delegates version. I figured this answer was long enough already...)

Let's start off by simplifying the code considerably. Here's a short but complete example which still demonstrates the problem, but removes everything irrelevant. I've also changed the order of the type arguments in IFunction just to match more normal conventions (e.g. Func<T, TResult>):

// We could even simplify further to only have IElement and IChildElement... public interface IElement {} public interface IChildElement : IElement {} public interface IGrandchildElement : IChildElement {}  public interface IFunction<in T, TResult> where T : IElement {     TResult Evaluate(T x); }  public class Foo : IFunction<IChildElement, double>,                    IFunction<IGrandchildElement, double> {     public double Evaluate(IChildElement x) { return 0; }     public double Evaluate(IGrandchildElement x) { return 1; } }  class Test {     static TResult Evaluate<TResult>(IFunction<IChildElement, TResult> function)     {         return function.Evaluate(null);     }      static void Main()     {         Foo f = new Foo();         double res1 = Evaluate(f);         double res2 = Evaluate<double>(f);     } } 

This still has the same problem:

Test.cs(27,23): error CS0411: The type arguments for method         'Test.Evaluate<TResult>(IFunction<IChildElement,TResult>)' cannot be         inferred from the usage. Try specifying the type arguments explicitly. 

Now, as for why it happens... the problem is type inference, as others have said. The type inference mechanism in C# (as of C# 3) is pretty good, but it's not as powerful as it could be.

Let's look at what happens at the method invocation part, with reference to the C# 5 language specification.

7.6.5.1 (method invocations) is the important part here. The first step is:

The set of candidate methods for the method invocation is constructed. For each method F associated with the method group M:

  • If F is non-generic, F is a candidate when:
    • M has no type argument list, and
    • F is applicable with respect to A (§7.5.3.1).
  • If F is generic and M has no type argument list, F is a candidate when:
    • Type inference (§7.5.2) succeeds, inferring a list of type arguments for the call, and
    • Once the inferred type arguments are substituted for the corresponding method type parameters, all constructed types in the parameter list of F satisfy their constraints (§4.4.4), and the parameter list of F is applicable with respect to A (§7.5.3.1).
  • If F is generic and M includes a type argument list, F is a candidate when:
    • F has the same number of method type parameters as were supplied in the type argument list, and
    • Once the type arguments are substituted for the corresponding method type parameters, all constructed types in the parameter list of F satisfy their constraints (§4.4.4), and the parameter list of F is applicable with respect to A (§7.5.3.1).

Now here, the method group M is a set with a single method (Test.Evaluate) - fortunately section 7.4 (member lookup) is simple. So we only have a single F method to consider.

It is generic, and M has no type argument list, so we end up straight in section 7.5.2 - type inference. Notice how if there is an argument list, this is skipped entirely, and the third major bullet point above is satisfied - which is why the Evaluate<double>(f) call succeeds.

So, we've got a pretty good indication by now that the problem lies in type inference. Let's dive into it. (This is where it gets tricky, I'm afraid.)

7.5.2 itself is mostly just description, including the fact that type inference happens in phases.

The generic method we're trying to call is described as:

Tr M<X1...Xn>(T1 x1 ... Tm xm) 

and the method call is described as:

M(E1 ... Em) 

So in our case, we have:

  • Tr is TResult which is the same as X1.
  • T1 is IFunction<IChildElement, TResult>
  • x1 is function, a value parameter
  • E1 is f, which is of type Foo

Now let's try to apply that for the rest of type inference...

7.5.2.1 The first phase
For each of the method arguments Ei:

  • If Ei is an anonymous function, an explicit parameter type inference (§7.5.2.7) is made from Ei to Ti
  • Otherwise, if Ei has a type U and xi is a value parameter then a lower-bound inference is made from U to Ti.
  • Otherwise, if Ei has a type U and xi is a ref or out parameter then an exact inference is made from U to Ti.
  • Otherwise, no inference is made for this argument.

The second bullet point is relevant here: E1 is not an anonymous function, E1 has a type Foo, and x1 is a value parameter. So we end up with a lower-bound inference from Foo to T1. That lower-bound inference is described in 7.5.2.9. The important part here is:

Otherwise, sets U1...Uk and V1...Vk are determined by checking if any of the following cases apply:

  • [...]
  • V is a constructed class, struct, interface or delegate type C<V1...Vk> and there is a unique type C<U1...Uk> such that U (or, if U is a type parameter, its effective base class or any member of its effective interface set) is identical to, inherits from (directly or indirectly), or implements (directly or indirectly) C<U1...Uk>. (The "uniqueness" restriction means that in the case interface C<T>{} class U: C<X>, C<Y>{}, then no inference is made when inferring from U to C<T> because U1 could be X or Y.)

For the purposes of this part, U is Foo, and V is IFunction<IChildElement, TResult>. However, Foo implements both IFunction<IChildElement, double> and IFunction<IGrandchildelement, double>. So even though in both cases we'd end up with U2 as double, this clause isn't satisfied.

One thing which does surprise me in this is that this doesn't rely on the T in IFunction<in T, TResult> being contravariant. We get the same issue if we remove the in part. I would have expected it to work in that case, as there wouldn't be a conversion from IFunction<IGrandchildElement, TResult> to IFunction<IChildElement, TResult>. It's possible that that part is a compiler bug, but it's more likely to be me misreading the spec. However, in the case that's actually given, that's irrelevant - because of the contravariance of T, there is such a conversion, so both interfaces really are significant.

Anyway, that means we don't actually end up with any type inference from this argument!

That's the whole of the first phase.

The second phase is described like this:

7.5.2.2 The second phase

The second phase proceeds as follows:

  • All unfixed type variables Xi which do not depend on (§7.5.2.5) any Xj are fixed (§7.5.2.10).
  • If no such type variables exist, all unfixed type variables Xi are fixed for which all of the following hold:
    • There is at least one type variable Xj that depends on Xi
    • Xi has a non-empty set of bounds
  • If no such type variables exist and there are still unfixed type variables, type inference fails.
  • Otherwise, if no further unfixed type variables exist, type inference succeeds.
  • Otherwise, for all arguments Ei with corresponding parameter type Ti where the output types (§7.5.2.4) contain unfixed type variables Xj but the input types (§7.5.2.3) do not, an output type inference (§7.5.2.6) is made from Ei to Ti. Then the second phase is repeated.

I'm not going to copy out all the sub-clauses, but in our case...

  • Type variable X1 doesn't depend on any other type variables, as there aren't any other type variables. So we need to fix X1. (The section reference here is wrong - it should actually be 7.5.2.11. I'll let Mads know.)

We have no bounds for X1 (because the lower-bound inference earlier on didn't help) so we end up failing type inference at this point. Bang. It all hinges around that uniqueness part in 7.5.2.9.

Of course, this could be fixed. The type inference part of the specification could be made more powerful - the trouble is that that would also make it more complicated, resulting in:

  • It being harder for developers to reason about type inference (it's hard enough as it is!)
  • It being harder to specify correctly with no gaps
  • It being harder to implement correctly
  • Quite possibly, it having worse performance at compile-time (which can be an issue in interactive editors such as Visual Studio, which need to perform the same type inference for things like Intellisense to work)

It's all a balancing act. I think the C# team have done pretty well - the fact that it doesn't work in corner cases such as this isn't too much of a problem, IMO.

like image 69
Jon Skeet Avatar answered Oct 04 '22 11:10

Jon Skeet


Type inference fails because the compiler is not able to fix the type parameter to a unique mapping.

There is an ambiguity between IFunction<double, IChildElement> and IFunction<double, IGrandChildElement> because they are both bindable.

When type inference fail you must explicitly specify your type arguments. This is as per the C# language specification.

By specifying the explicit type argument you help the compiler since it can skip type inference altogether.

After you have explicitly specified that T is bound to double there is no longer an ambiguity since Tin is bound to IChildElement via your declaration of Context<IChildElement> and Tout is bound to double via the explicit type argument.

I agree that you may argue that the compiler might as well have inferred that usage since the type argument in this case doesn't really provide any additional information.

However, the specification says:

Type inference occurs as part of the binding-time processing of a method invocation (§7.6.5.1) and takes place before the overload resolution step of the invocation

...so I guess they wanted to separate these things. The reason for that is beyond me. I'm guessing it could have been simplicity of the specification or support for future extensions, or just that they didn't think of it :-)

like image 43
Mårten Wikström Avatar answered Oct 04 '22 12:10

Mårten Wikström