
How do C#/.Net generics know their parameter types?

Tags: c#, .net, generics

In C# a generic function or class is aware of the types of its generic parameters. This means that dynamic type operations, such as the is and as operators, are available on them (in contrast to Java, where they are not).

I'm curious: how does the compiler provide this type information to generic methods? For classes I can imagine that each instance simply holds a pointer to the type, but for generic functions I'm not sure. Perhaps it is just a hidden parameter?

If generics are preserved at the IL level, which I believe they are, then I'd like to know how this is done at that level.

edA-qa mort-ora-y asked Mar 05 '15 05:03


2 Answers

Since you've edited your question to extend it beyond the C# compiler to the JIT compiler, here's an overview of the process, taking List<T> as our example.

As we've established, there is only one IL representation of the List<T> class. This representation has a type parameter corresponding to the T type parameter seen in C# code. As Holger Thiemann says in his comment, when you use the List<> class with a given type argument, the JIT compiler creates a native-code representation of the class for that type argument.

However, for reference types, it compiles the native code only once and reuses it for all other reference types. This is possible because, in the virtual execution system (VES, commonly called the "runtime"), there is only one reference type, called O in the spec (see paragraph I.12.1, table I.6, in the standard: http://www.ecma-international.org/publications/standards/Ecma-335.htm). This type is defined as a "native size object reference to managed memory."

In other words, all objects in the (virtual) evaluation stack of the VES are represented by an "object reference" (effectively a pointer), which, taken by itself, is essentially typeless. How then does the VES ensure that we don't use members of an incompatible type? What stops us from calling the string.Length property on an instance of System.Random?

To enforce type safety, the VES uses metadata that describes the static type of each object reference, comparing the type of a method call's receiver to the type identified by the method's metadata token (this applies to access of other member types as well).

For example, to call a method of the object's class, the reference to the object must be on the top of the virtual evaluation stack. The static type of this reference is known thanks to the method's metadata and analysis of the "stack transition" -- the changes in the state of the stack caused by each IL instruction. The call or callvirt instruction then indicates the method to be called by including a metadata token representing the method, which of course indicates the type on which the method is defined.

The VES "verifies" the code before compiling it, comparing the reference's type to that of the method. If the types are not compatible, verification fails and the runtime refuses to execute the method, raising an error rather than running the unsafe code.

This works just as well for generic type parameters as it does for non-generic types. To achieve this, the VES limits the methods that can be called on a reference whose type is an unconstrained generic type parameter. The only allowed methods are those defined on System.Object, because all objects are instances of that type.

For a constrained type parameter, references of that type can also receive calls to methods defined by the constraint types. For example, if you write a method in which you have constrained type T to implement ICollection, you can call the ICollection.Count getter on a reference of type T. The VES knows that it is safe to call this getter because it ensures that any reference being stored to that position in the stack will be an instance of some type that implements the ICollection interface. No matter what the actual type of the object is, the JIT compiler can therefore use the same native code.
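To make the difference between unconstrained and constrained type parameters concrete, here is a minimal sketch. The class and method names (ConstraintDemo, Describe, CountOf) are made up for illustration and do not come from the framework.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class ConstraintDemo
{
    // Unconstrained T: only members of System.Object are available on value.
    public static string Describe<T>(T value)
    {
        // value.Count would not compile here; ToString() is defined on object.
        return value == null ? "null" : value.ToString();
    }

    // Constrained T: members of ICollection become callable on a T reference.
    public static int CountOf<T>(T items) where T : ICollection
    {
        return items.Count;
    }

    public static void Main()
    {
        Console.WriteLine(Describe(42));                            // 42
        Console.WriteLine(CountOf(new ArrayList { 1, 2, 3 }));      // 3
        Console.WriteLine(CountOf(new List<string> { "a", "b" }));  // 2
    }
}
```

CountOf accepts both ArrayList and List<string> because each implements the non-generic ICollection interface, which is all the method body relies on.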

Consider also fields that depend on the generic type parameter. In the case of List<T>, there is an array of type T[] that holds the elements in the list. Remember that the actual in-memory array will be an array of O object references. The native code to construct that array, or to read or write its elements, looks just the same regardless of whether the array is a member of a List<string> or of a List<FileInfo>.

So, within the scope of an unconstrained generic type such as List<T>, the T references are just as good as System.Object references. The advantage of generics, though, is that the VES substitutes the type argument for the type parameter in the caller's scope. In other words, even though List<string> and List<FileInfo> treat their elements the same internally, the callers see that the Find method of the one returns a string, while that of the other returns a FileInfo.

Finally, because all of this is achieved by metadata in the IL, and because the VES uses the metadata when it loads and JIT-compiles the types, the information can be extracted at run time through reflection.
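As a small illustration of that last point, the sketch below uses standard reflection calls to inspect the open List<> definition and two closed instantiations. The ReflectionDemo class and its FindReturnType helper are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ReflectionDemo
{
    // Returns the declared return type of Find on a closed List<> type.
    public static Type FindReturnType(Type closedListType)
    {
        return closedListType.GetMethod("Find").ReturnType;
    }

    public static void Main()
    {
        // The single generic definition, with its type parameter T.
        Type open = typeof(List<>);
        Console.WriteLine(open.IsGenericTypeDefinition);              // True
        Console.WriteLine(open.GetGenericArguments()[0].Name);        // T

        // A closed type, with the type argument substituted for T.
        Type closed = typeof(List<string>);
        Console.WriteLine(closed.GetGenericArguments()[0]);           // System.String
        Console.WriteLine(closed.GetGenericTypeDefinition() == open); // True

        // Callers of each instantiation see a different return type for Find.
        Console.WriteLine(FindReturnType(typeof(List<string>)));      // System.String
        Console.WriteLine(FindReturnType(typeof(List<FileInfo>)));    // System.IO.FileInfo
    }
}
```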

phoog answered Oct 17 '22 15:10


You asked how casts (including is and as) can work on variables of a generic type parameter. Since all objects store metadata about their own type, all casts work the same way as if the variable had been declared as object. The object is interrogated about its type and the decision is made at run time.
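A short sketch of this behavior follows; it only shows the observable results, not how the runtime implements the checks. CastDemo and its methods are illustrative names, not framework APIs.

```csharp
using System;

public static class CastDemo
{
    // Non-generic version: the check is made against the object's runtime type.
    public static bool IsString(object value)
    {
        return value is string;
    }

    // Generic version: the same kind of runtime check, with T as the target.
    public static bool Is<T>(object value)
    {
        return value is T;
    }

    // "as" needs a reference-type constraint so that null can signal failure.
    public static T AsOrNull<T>(object value) where T : class
    {
        return value as T;
    }

    public static void Main()
    {
        object text = "hello";
        object rng = new Random();

        Console.WriteLine(IsString(text));                 // True
        Console.WriteLine(Is<string>(text));               // True
        Console.WriteLine(Is<string>(rng));                // False
        Console.WriteLine(AsOrNull<string>(rng) == null);  // True
        Console.WriteLine(AsOrNull<Random>(rng) == rng);   // True
    }
}
```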

Of course this technique is only valid for reference types. For value types the JIT compiles one specialized native method for each value type that is used to instantiate the generic type parameters. In that specialized method the type of T is exactly known. No further "magic" is needed. Value type parameters are therefore a "boring" case. To the JIT it looks like there are no generic type parameters at all.

How can typeof(T) work? This value is passed as a hidden parameter to generic methods. This is also how someObj as T is able to work. I'm quite sure it's being compiled as a call to a runtime helper (e.g. RuntimeCastHelper(someObj, typeof(T))).
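The effect can be observed from C# even though the hidden parameter itself is not visible. In this sketch (TypeOfDemo, NameOf and IsExactly are hypothetical names), NameOf takes no arguments at all, so the type it reports can only come from the instantiation itself.

```csharp
using System;
using System.Collections.Generic;

public static class TypeOfDemo
{
    // typeof(T) is resolved per instantiation, not from any argument value.
    public static string NameOf<T>()
    {
        return typeof(T).Name;
    }

    // The static type argument and the runtime type of the value can differ.
    public static bool IsExactly<T>(T value)
    {
        return value != null && value.GetType() == typeof(T);
    }

    public static void Main()
    {
        Console.WriteLine(NameOf<int>());           // Int32
        Console.WriteLine(NameOf<string>());        // String
        Console.WriteLine(NameOf<List<string>>());  // List`1

        Console.WriteLine(IsExactly<string>("a"));  // True
        Console.WriteLine(IsExactly<object>("a"));  // False
    }
}
```

The last two lines show that typeof(T) reflects the type argument chosen by the caller, while GetType() reflects the metadata stored with the object.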

usr answered Oct 17 '22 14:10