
Why can't Mono support generic interface instantiation with AOT?

The Mono documentation has a code example about full AOT not supporting generic interface instantiation:

interface IFoo<T> {
... 
    void SomeMethod();
}

It says:

"Since Mono has no way of determining from the static analysis what method will implement the interface IFoo<int>.SomeMethod this particular pattern is not supported."

So I take it that the compiler can't resolve this method through type inference. But I still don't understand the underlying reason for the full AOT limitation.

A similar problem appears in Unity's AOT scripting restrictions. In the following code:

using UnityEngine;
using System;

public class AOTProblemExample : MonoBehaviour, IReceiver 
{
    public enum AnyEnum {
        Zero,
        One,
    }

    void Start() {
        // Subtle trigger: The type of manager *must* be
        // IManager, not Manager, to trigger the AOT problem.
        IManager manager = new Manager();
        manager.SendMessage(this, AnyEnum.Zero);
    }

    public void OnMessage<T>(T value) {
        Debug.LogFormat("Message value: {0}", value);
    }
}

public class Manager : IManager {
    public void SendMessage<T>(IReceiver target, T value) {
        target.OnMessage(value);
    }
}

public interface IReceiver {
    void OnMessage<T>(T value);
}

public interface IManager {
    void SendMessage<T>(IReceiver target, T value);
}

I am confused by this:

The AOT compiler does not realize that it should generate code for the generic method OnMessage with a T of AnyEnum, so it blissfully continues, skipping this method. When that method is called, and the runtime can’t find the proper code to execute, it gives up with this error message.

Why does the AOT compiler not know the type when the JIT compiler can infer it? Can anyone offer a detailed answer?

user4568159 asked May 22 '16 15:05


1 Answer

Before describing the issues, consider this excerpt from another answer of mine that describes the generics situation on platforms that do support dynamic code generation:

In C# generics, the generic type definition is maintained in memory at runtime. Whenever a new concrete type is required, the runtime environment combines the generic type definition and the type arguments and creates the new type (reification). So we get a new type for each combination of the type arguments, at runtime.

The phrase at runtime is key to this, because it brings us to another point:

This implementation technique depends heavily on runtime support and JIT-compilation (which is why you often hear that C# generics have some limitations on platforms like iOS, where dynamic code generation is restricted).
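To make "at runtime" concrete, here is a minimal sketch of reification on a JIT-enabled runtime. The class and method names (ReificationDemo, MakeListOf) are made up for illustration; typeof, Type.MakeGenericType and Activator.CreateInstance are standard reflection APIs. On a full AOT platform, a call like this may fail at run time if the requested instantiation was never precompiled.

```csharp
using System;
using System.Collections.Generic;

public static class ReificationDemo
{
    // Hypothetical helper: builds a List<T> for a type argument
    // that is only known at run time.
    public static object MakeListOf(Type elementType)
    {
        Type open = typeof(List<>);                       // generic type definition
        Type closed = open.MakeGenericType(elementType);  // closed type, created on demand
        return Activator.CreateInstance(closed);
    }

    public static void Main()
    {
        object list = MakeListOf(typeof(int));
        // The reified type is the very same type as the statically written List<int>.
        Console.WriteLine(list.GetType() == typeof(List<int>)); // prints True
    }
}
```

Nothing in the source text of MakeListOf mentions List<int>; the closed type only comes into existence when the call is made.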

So is it possible for a full AOT compiler to do that as well? It most certainly is possible. But is it easy?

There is a paper from Microsoft Research on pre-compiling .NET generics that describes the interaction of generics with AOT compilation, highlights some potential problems and proposes solutions. In this answer, I will use that paper to try to demonstrate why .NET generics aren't widely pre-compiled (yet).

Everything must be instantiated

Consider your example:

IManager manager = new Manager();
manager.SendMessage(this, AnyEnum.Zero);

Clearly we're calling the method IManager.SendMessage<AnyEnum> here, so a full AOT compiler needs to compile that method.

But this is an interface call, which is effectively a virtual call, which means that we can't know ahead of time which implementation of the interface method will be called.

The JIT compiler sidesteps this problem. When code attempts to run a method that hasn't been compiled yet, the JIT is notified and compiles the method lazily, at a point where the concrete receiver type and the type arguments are already known.

In contrast, a full AOT compiler doesn't have access to this runtime type information. So it has to pessimistically compile all possible instantiations of the generic method on all implementations of the interface. (Or just give up and not offer that feature.)
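A commonly suggested workaround is to give the AOT compiler a statically visible use of each instantiation it would otherwise miss. The sketch below reuses the types from the question, with a plain Receiver class standing in for the MonoBehaviour so that it runs without Unity. The method name UsedOnlyForAotCodeGeneration and the Last field are illustrative assumptions, and whether such a hint is sufficient depends on the AOT compiler in use.

```csharp
using System;

public enum AnyEnum { Zero, One }

public interface IReceiver { void OnMessage<T>(T value); }
public interface IManager { void SendMessage<T>(IReceiver target, T value); }

public class Manager : IManager
{
    public void SendMessage<T>(IReceiver target, T value) => target.OnMessage(value);
}

public class Receiver : IReceiver
{
    public string Last = "";

    public void OnMessage<T>(T value) => Last = typeof(T).Name + ":" + value;

    // Hypothetical hint method: never called at run time. It only exists so
    // that static analysis sees OnMessage<AnyEnum> and Manager.SendMessage<AnyEnum>
    // called directly on concrete types rather than through an interface.
    public void UsedOnlyForAotCodeGeneration()
    {
        OnMessage(AnyEnum.Zero);
        new Manager().SendMessage(null, AnyEnum.Zero);
        throw new InvalidOperationException("AOT hint only; do not call at run time.");
    }
}

public static class AotHintDemo
{
    public static string Send()
    {
        var receiver = new Receiver();
        IManager manager = new Manager();   // interface-typed, as in the question
        manager.SendMessage(receiver, AnyEnum.Zero);
        return receiver.Last;
    }

    public static void Main() => Console.WriteLine(Send()); // prints AnyEnum:Zero
}
```

On a JIT runtime the hint method is redundant. It matters only when every instantiation has to be discovered before the program runs.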

Generics can be infinitely recursive

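object M<T>(long n)
{
    if (n == 1)
    {
        // An empty array whose element type is T.
        return new T[0];
    }
    else 
    {
        // Each level of recursion wraps T in one more array dimension.
        return M<T[]>(n - 1);
    }
}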

To instantiate M<int>(), the compiler needs to instantiate int[] and M<int[]>(). To instantiate M<int[]>(), the compiler needs to instantiate int[][] and M<int[][]>(). To instantiate M<int[][]>(), the compiler needs to instantiate int[][][] and M<int[][][]>(). And so on: the depth depends on the run-time value of n, so static analysis never reaches a point where it can stop.
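A runnable sketch of the method above shows that the type that comes back depends on the run-time argument. The wrapper class RecursiveGenerics is an assumption added so the snippet compiles on its own, and the base case is loosened to n <= 1 so that small arguments terminate. It is meant for a JIT-enabled runtime.

```csharp
using System;

public static class RecursiveGenerics
{
    public static object M<T>(long n)
    {
        if (n <= 1)
        {
            return new T[0];       // empty array with element type T
        }
        return M<T[]>(n - 1);      // recurse with one more array dimension
    }

    public static void Main()
    {
        Console.WriteLine(M<int>(1).GetType()); // prints System.Int32[]
        Console.WriteLine(M<int>(3).GetType()); // prints System.Int32[][][]
    }
}
```

The source contains a single call to M<T[]>, yet the set of instantiations actually needed is determined by n, which a compiler generally cannot know ahead of time.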

This can be solved by using representative instantiations (as the JIT compiler does). This means that all instantiations whose generic arguments are reference types can share the same compiled code. So:

  • int[][], int[][][], int[][][][] (and so on) can all share the same code, because they are arrays of references.
  • M<int[]>, M<int[][]>, M<int[][][]> (and so on) can all share the same code, because they operate on references.

Assemblies need to own their generics...

Since C# programs are compiled into assemblies, it's hard to tell exactly who should "own" which instantiation of each type.

  • Assembly1 declares the type G<T>.
  • Assembly2 (references Assembly1) instantiates the type G<int>.
  • Assembly3 (references Assembly1) instantiates the type G<int> as well.
  • AssemblyX (references all the above) wants to use G<int>.

Which assembly gets to compile the actual G<int>? If they happen to be standalone libraries, neither Assembly2 nor Assembly3 can be compiled without each owning a copy of G<int>. So we're already looking at duplicated native code.

...and those generics must still be compatible with each other

But then, when AssemblyX is compiled, which copy of G<int> should it use? Clearly, it has to be able to handle both, because it may need to receive a G<int> from or send a G<int> to either assembly.

But more importantly, in C# you can't have two types with identical fully qualified names that turn out to be incompatible. In other words:

G<int> obj = new G<int>();

The above can never fail on the grounds that G<int> (the variable's type) is the G<int> from Assembly2 while G<int> (the constructor's type) is the G<int> from Assembly3. If it fails for a reason like that, we're not in C# anymore!

So both types need to exist and they need to be made transparently compatible, even though they are compiled separately. For this to happen, the type handles need to be manipulated at link time in such a way that the semantics of the language are retained, including the fact that they should be assignable to each other, their type handles should compare as equal (e.g. when using typeof), and so on.
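The identity guarantee can be observed on an ordinary runtime. In the sketch below, the static classes Assembly2Code and Assembly3Code are only stand-ins for two separately compiled assemblies (everything lives in one file here, which is an assumption made for brevity). The point is the contract that precompiled code would have to preserve, not how a linker achieves it.

```csharp
using System;

public class G<T>
{
    public T Value;
}

// Stand-in for code compiled into "Assembly2".
public static class Assembly2Code
{
    public static G<int> Make() => new G<int> { Value = 2 };
}

// Stand-in for code compiled into "Assembly3".
public static class Assembly3Code
{
    public static G<int> Make() => new G<int> { Value = 3 };
}

public static class IdentityDemo
{
    public static bool SameType()
    {
        G<int> a = Assembly2Code.Make();   // assignable without any conversion
        G<int> b = Assembly3Code.Make();

        // Same Type object and same type handle, whoever created the instance.
        return a.GetType() == b.GetType()
            && a.GetType() == typeof(G<int>)
            && a.GetType().TypeHandle.Equals(b.GetType().TypeHandle);
    }

    public static void Main() => Console.WriteLine(SameType()); // prints True
}
```

With a JIT there is only ever one G<int> per loaded G<T>, so this holds trivially. An AOT toolchain that emits one copy per assembly has to make the copies indistinguishable in exactly these checks.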

Theodoros Chatzigiannakis answered Sep 30 '22 09:09
