How is a virtual generic method call implemented?

Tags:

c#

.net

clr

I'm interested in how the CLR implements calls like this:

abstract class A {
    public abstract void Foo<T, U, V>();
}

A a = ...
a.Foo<int, string, decimal>(); // <=== ?

Does this call cause some kind of hash map lookup, with the type parameter tokens as the keys and the compiled generic method specializations (one shared by all reference types and separate code for each value type) as the values?

asked Jul 04 '11 15:07

controlflow

2 Answers

I couldn't find much precise information about this, so most of this answer is based on the excellent 2001 paper on .NET generics (written even before .NET 1.0 came out!), one short note in a follow-up paper, and what I gathered from the SSCLI 2.0 source code (even though I wasn't able to find the exact code for calling virtual generic methods).

Let's start simple: how is a non-generic, non-virtual method called? By calling the method's code directly, so the compiled code contains a direct address. The compiler gets the method address from the method table (see the next paragraph). Can it be that simple? Well, almost. The fact that methods are JIT-compiled makes it a little more complicated: what is actually called is either code that compiles the method and only then executes it, if the method hasn't been compiled yet, or a single instruction that directly calls the compiled code, if it already exists. I'm going to ignore this detail from here on.
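The "compile on first call, then go straight to the code" behaviour can be modelled in plain C#. This is only a sketch of the idea, not how the CLR does it: `LazySlot`, `compile` and `CompileCount` are hypothetical names, and a delegate stands in for a machine-code address.

```csharp
using System;

// Hypothetical model of one method entry: it starts out pointing at a stub
// that "compiles" the method, then repoints itself at the compiled code.
public sealed class LazySlot
{
    private Func<int, int> target;

    // Number of times the pretend JIT ran; only here to make the effect visible.
    public int CompileCount { get; private set; }

    public LazySlot(Func<Func<int, int>> compile)
    {
        // Initial target: the stub.
        target = arg =>
        {
            CompileCount++;
            target = compile();   // patch the slot with the "compiled" code
            return target(arg);   // then run the method as if nothing happened
        };
    }

    // Callers always go through the slot and never notice the difference.
    public int Call(int arg) => target(arg);
}
```

Creating `new LazySlot(() => x => x * 2)` and calling `Call` twice runs the `compile` callback only on the first call; the second call goes directly to the patched target.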

Now, how is a non-generic virtual method called? As with polymorphism in languages like C++, there is a method table accessible from the this pointer (reference). Each derived class has its own method table containing its methods. So, to call a virtual method, take the this reference (passed in as a parameter), follow it to the method table, look up the correct entry (the entry number is constant for a given method) and call the code that the entry points to. Calling methods through interfaces is slightly more complicated, but that isn't interesting for us right now.
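Those steps can be sketched in C# itself. This is a toy model under my own assumptions, not the CLR's data layout: `FakeObject`, `VirtualCallDemo`, `SpeakSlot` and the two tables are hypothetical names, and delegates stand in for code addresses.

```csharp
using System;

// Hypothetical model: an object is just a reference to its type's method
// table (fields omitted), and a method table is an array of code pointers.
public sealed class FakeObject
{
    public Func<FakeObject, string>[] MethodTable { get; }

    public FakeObject(Func<FakeObject, string>[] methodTable)
    {
        MethodTable = methodTable;
    }
}

public static class VirtualCallDemo
{
    // Slot index of the virtual method; the same constant for every derived type.
    public const int SpeakSlot = 0;

    // Base type's table.
    public static readonly Func<FakeObject, string>[] AnimalTable = { self => "..." };

    // The derived type has its own table, with the slot pointing at its override.
    public static readonly Func<FakeObject, string>[] DogTable = { self => "Woof" };

    // What a virtual call site boils down to: this -> method table -> slot -> call.
    public static string CallSpeak(FakeObject self) => self.MethodTable[SpeakSlot](self);
}
```

`CallSpeak` contains no knowledge of the concrete type; which code runs depends only on the table the object carries.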

Now we need to know about code sharing. Code can be shared between two “instances” of the same method if every reference type among the type arguments corresponds to some (possibly different) reference type in the other instance, and the value types are exactly the same. So, for example, C<string>.M<int>() shares code with C<object>.M<int>(), but not with C<string>.M<byte>(). There is no difference here between the type parameters of the type and the type parameters of the method. (The original paper from 2001 mentions that code can also be shared when both parameters are structs with the same layout, but I'm not sure this is true in the actual implementation.)
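The sharing rule can be written down as a small predicate. This is a simplified sketch of the rule as described above, not the CLR's algorithm: `SharingRule`, `AnyRef`, `Canonicalize` and `CanShareCode` are hypothetical names, and the sketch treats generic value types as opaque instead of looking at their own type arguments.

```csharp
using System;
using System.Linq;

public static class SharingRule
{
    // Hypothetical placeholder standing in for "any reference type".
    private sealed class AnyRef { }

    // Replace every reference type with the placeholder; keep value types as they are.
    public static Type[] Canonicalize(params Type[] typeArgs) =>
        typeArgs.Select(t => t.IsValueType ? t : typeof(AnyRef)).ToArray();

    // Two instantiations can share code when their canonical forms are equal.
    public static bool CanShareCode(Type[] a, Type[] b) =>
        Canonicalize(a).SequenceEqual(Canonicalize(b));
}
```

With the type arguments of the class and the method listed together, `{ string, int }` and `{ object, int }` (C<string>.M<int>() and C<object>.M<int>()) compare as shareable, while `{ string, int }` and `{ string, byte }` do not.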

Let's take an intermediate step on our way to generic methods: non-generic methods in generic types. Because of code sharing, the shared code needs to get the actual type arguments from somewhere (e.g. to execute code like new T[]). For this reason, each instantiation of a generic type (e.g. C<string> and C<object>) has its own type handle, which contains the type arguments and also the method table. Ordinary methods can reach this type handle (technically a structure confusingly called MethodTable, even though it contains more than just the method table) through the this reference. There are two kinds of methods that can't do that: static methods and methods on value types. For those, the type handle is passed in as a hidden argument.
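The effect is easy to observe from C#, even though the mechanism itself is hidden. In this small example (the names `ArrayTypeCreated` and `TypeHandleDemo` are mine), the same method body produces a different array type for each instantiation, so it must be getting the type argument from somewhere at run time:

```csharp
using System;

public class C<T>
{
    // The body has to know T at run time to allocate the right array type.
    public Type ArrayTypeCreated() => new T[0].GetType();
}

public static class TypeHandleDemo
{
    // C<string> and C<object> are both reference-type instantiations,
    // yet each one still creates an array of its own element type.
    public static bool Check() =>
        new C<string>().ArrayTypeCreated() == typeof(string[]) &&
        new C<object>().ArrayTypeCreated() == typeof(object[]);
}
```

`TypeHandleDemo.Check()` returns true. The example shows only the observable behaviour; it doesn't prove how the type argument is delivered.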

For non-virtual generic methods, the type handle is not enough, so they get a different hidden argument, a MethodDesc, which contains the method's type arguments. Also, the compiler can't store the instantiations in the ordinary method table, because that table is static. So it creates a second, separate method table for generic methods, indexed by type arguments, and gets the method address from there if an entry with compatible type arguments already exists, or creates a new entry otherwise.
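The "look up by type arguments, or create a new entry" behaviour could look roughly like this. It is a sketch of the idea as I understand it, not the CLR's data structure: `InstantiationTable`, `GetOrCompile`, `CompileCount` and the string keys are hypothetical, and a string stands in for the compiled code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model of a per-method table of instantiations,
// indexed by type arguments.
public sealed class InstantiationTable
{
    private readonly Dictionary<string, string> entries = new Dictionary<string, string>();

    // Counts how often code had to be "compiled"; only here for observation.
    public int CompileCount { get; private set; }

    // Reference types collapse into one key component; value types stay distinct.
    private static string Key(Type[] typeArgs) =>
        string.Join(",", typeArgs.Select(t => t.IsValueType ? t.FullName : "<ref>"));

    // Return the existing entry for compatible type arguments, or create one.
    public string GetOrCompile(params Type[] typeArgs)
    {
        string key = Key(typeArgs);
        if (!entries.TryGetValue(key, out var code))
        {
            CompileCount++;
            code = "code for <" + key + ">";
            entries[key] = code;
        }
        return code;
    }
}
```

Asking for `<int, string, decimal>` and then `<int, object, decimal>` yields the same entry after a single "compilation", while `<long, string, decimal>` creates a second one.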

Virtual generic methods are now simple: the compiler doesn't know the concrete type, so the method has to be found through the method table at run time. The normal method table can't be used, so the lookup has to go through the special method table for generic methods. Of course, the hidden parameter containing the type arguments is still present.
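For reference, here is the situation from the question as a complete example. The override classes `B1` and `B2` and the helper `VirtualGenericDemo` are my additions, and `Foo` returns a string instead of void only so that the result can be inspected. It shows what has to be resolved at run time, not how the CLR resolves it.

```csharp
using System;

public abstract class A
{
    public abstract string Foo<T, U, V>();
}

public class B1 : A
{
    // Reports all three type arguments it was instantiated with.
    public override string Foo<T, U, V>() =>
        "B1:" + typeof(T).Name + "," + typeof(U).Name + "," + typeof(V).Name;
}

public class B2 : A
{
    public override string Foo<T, U, V>() => "B2:" + typeof(U).Name;
}

public static class VirtualGenericDemo
{
    // The call site only knows the static type A. Both the override to run
    // and the instantiation <int, string, decimal> are picked at run time.
    public static string Call(A a) => a.Foo<int, string, decimal>();
}
```

`VirtualGenericDemo.Call(new B1())` returns "B1:Int32,String,Decimal" and `VirtualGenericDemo.Call(new B2())` returns "B2:String", so a single call site reaches different overrides, each receiving the right type arguments.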

One interesting tidbit I learned while researching this: because the JIT compiler is very lazy, the following (completely useless) code works:

object Lift<T>(int count) where T : new()
{
    if (count == 0)
        return new T();

    return Lift<List<T>>(count - 1);
}

The equivalent C++ code makes the compiler give up with a stack overflow.

answered Oct 05 '22 23:10

svick

Yes. The CLR generates the code for a specific type at run time and keeps a hash table (or something similar) of the implementations.

Page 372 of CLR via C#:

When a method that uses generic type parameters is JIT-compiled, the CLR takes the method's IL, substitutes the specified type arguments, and then creates native code that is specific to that method operating on the specified data types. This is exactly what you want and is one of the main features of generics. However, there is a downside to this: the CLR keeps generating native code for every method/type combination. This is referred to as code explosion. This can end up increasing the application's working set substantially, thereby hurting performance. Fortunately, the CLR has some optimizations built into it to reduce code explosion. First, if a method is called for a particular type argument, and later, the method is called again using the same type argument, the CLR will compile the code for this method/type combination just once. So if one assembly uses List<DateTime>, and a completely different assembly (loaded in the same AppDomain) also uses List<DateTime>, the CLR will compile the methods for List<DateTime> just once. This reduces code explosion substantially.

answered Oct 06 '22 01:10

Aliostad
