Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How the CLR implements IEnumerable<T> on Arrays? [duplicate]

So as you may know, arrays in C# implement IList<T>, among other interfaces. Somehow though, they do this without publicly implementing the Count property of IList<T>! Arrays have only a Length property.

Is this a blatant example of C#/.NET breaking its own rules about the interface implementation or am I missing something?

like image 465
MgSam Avatar asked Jun 22 '12 20:06

MgSam


6 Answers

So as you may know, arrays in C# implement IList<T>, among other interfaces

Well, yes, erm no, not really. This is the declaration for the Array class in the .NET 4 framework:

[Serializable, ComVisible(true)]
public abstract class Array : ICloneable, IList, ICollection, IEnumerable, 
                              IStructuralComparable, IStructuralEquatable
{
    // etc..
}

It implements System.Collections.IList, not System.Collections.Generic.IList<>. It can't, Array is not generic. Same goes for the generic IEnumerable<> and ICollection<> interfaces.

But the CLR creates concrete array types on the fly, so it could technically create one that implements these interfaces. This is however not the case. Try this code for example:

using System;
using System.Collections.Generic;

class Program {
    static void Main(string[] args) {
        var goodmap = typeof(Derived).GetInterfaceMap(typeof(IEnumerable<int>));
        var badmap = typeof(int[]).GetInterfaceMap(typeof(IEnumerable<int>));  // Kaboom
    }
}
abstract class Base { }
class Derived : Base, IEnumerable<int> {
    public IEnumerator<int> GetEnumerator() { return null; }
    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

The GetInterfaceMap() call fails for a concrete array type with "Interface not found". Yet a cast to IEnumerable<> works without a problem.

This is quacks-like-a-duck typing. It is the same kind of typing that creates the illusion that every value type derives from ValueType which derives from Object. Both the compiler and the CLR have special knowledge of array types, just as they do of value types. The compiler sees your attempt at casting to IList<> and says "okay, I know how to do that!". And emits the castclass IL instruction. The CLR has no trouble with it, it knows how to provide an implementation of IList<> that works on the underlying array object. It has built-in knowledge of the otherwise hidden System.SZArrayHelper class, a wrapper that actually implements these interfaces.

Which it doesn't do explicitly like everybody claims, the Count property you asked about looks like this:

    internal int get_Count<T>() {
        //! Warning: "this" is an array, not an SZArrayHelper. See comments above
        //! or you may introduce a security hole!
        T[] _this = JitHelpers.UnsafeCast<T[]>(this);
        return _this.Length;
    }

Yes, you can certainly call that comment "breaking the rules" :) It is otherwise darned handy. And extremely well hidden, you can check this out in SSCLI20, the shared source distribution for the CLR. Search for "IList" to see where the type substitution takes place. The best place to see it in action is clr/src/vm/array.cpp, GetActualImplementationForArrayGenericIListMethod() method.

This kind of substitution in the CLR is pretty mild compared to what happens in the language projection in the CLR that allows writing managed code for WinRT (aka Metro). Just about any core .NET type gets substituted there. IList<> maps to IVector<> for example, an entirely unmanaged type. Itself a substitution, COM doesn't support generic types.

Well, that was a look at what happens behind the curtain. It can be very uncomfortable, strange and unfamiliar seas with dragons living at the end of the map. It can be very useful to make the Earth flat and model a different image of what's really going on in managed code. Mapping it to everybody favorite answer is comfortable that way. Which doesn't work so well for value types (don't mutate a struct!) but this one is very well hidden. The GetInterfaceMap() method failure is the only leak in the abstraction that I can think of.

like image 184
Hans Passant Avatar answered Nov 01 '22 12:11

Hans Passant


New answer in the light of Hans's answer

Thanks to the answer given by Hans, we can see the implementation is somewhat more complicated than we might think. Both the compiler and the CLR try very hard to give the impression that an array type implements IList<T> - but array variance makes this trickier. Contrary to the answer from Hans, the array types (single-dimensional, zero-based anyway) do implement the generic collections directly, because the type of any specific array isn't System.Array - that's just the base type of the array. If you ask an array type what interfaces it supports, it includes the generic types:

foreach (var type in typeof(int[]).GetInterfaces())
{
    Console.WriteLine(type);
}

Output:

System.ICloneable
System.Collections.IList
System.Collections.ICollection
System.Collections.IEnumerable
System.Collections.IStructuralComparable
System.Collections.IStructuralEquatable
System.Collections.Generic.IList`1[System.Int32]
System.Collections.Generic.ICollection`1[System.Int32]
System.Collections.Generic.IEnumerable`1[System.Int32]

For single-dimensional, zero-based arrays, as far as the language is concerned, the array really does implement IList<T> too. Section 12.1.2 of the C# specification says so. So whatever the underlying implementation does, the language has to behave as if the type of T[] implements IList<T> as with any other interface. From this perspective, the interface is implemented with some of the members being explicitly implemented (such as Count). That's the best explanation at the language level for what's going on.

Note that this only holds for single-dimensional arrays (and zero-based arrays, not that C# as a language says anything about non-zero-based arrays). T[,] doesn't implement IList<T>.

From a CLR perspective, something funkier is going on. You can't get the interface mapping for the generic interface types. For example:

typeof(int[]).GetInterfaceMap(typeof(ICollection<int>))

Gives an exception of:

Unhandled Exception: System.ArgumentException: Interface maps for generic
interfaces on arrays cannot be retrived.

So why the weirdness? Well, I believe it's really due to array covariance, which is a wart in the type system, IMO. Even though IList<T> is not covariant (and can't be safely), array covariance allows this to work:

string[] strings = { "a", "b", "c" };
IList<object> objects = strings;

... which makes it look like typeof(string[]) implements IList<object>, when it doesn't really.

The CLI spec (ECMA-335) partition 1, section 8.7.1, has this:

A signature type T is compatible-with a signature type U if and only if at least one of the following holds

...

T is a zero-based rank-1 array V[], and U is IList<W>, and V is array-element-compatible-with W.

(It doesn't actually mention ICollection<W> or IEnumerable<W> which I believe is a bug in the spec.)

For non-variance, the CLI spec goes along with the language spec directly. From section 8.9.1 of partition 1:

Additionally, a created vector with element type T, implements the interface System.Collections.Generic.IList<U>, where U := T. (§8.7)

(A vector is a single-dimensional array with a zero base.)

Now in terms of the implementation details, clearly the CLR is doing some funky mapping to keep the assignment compatibility here: when a string[] is asked for the implementation of ICollection<object>.Count, it can't handle that in quite the normal way. Does this count as explicit interface implementation? I think it's reasonable to treat it that way, as unless you ask for the interface mapping directly, it always behaves that way from a language perspective.

What about ICollection.Count?

So far I've talked about the generic interfaces, but then there's the non-generic ICollection with its Count property. This time we can get the interface mapping, and in fact the interface is implemented directly by System.Array. The documentation for the ICollection.Count property implementation in Array states that it's implemented with explicit interface implementation.

If anyone can think of a way in which this kind of explicit interface implementation is different from "normal" explicit interface implementation, I'd be happy to look into it further.

Old answer around explicit interface implementation

Despite the above, which is more complicated because of the knowledge of arrays, you can still do something with the same visible effects through explicit interface implementation.

Here's a simple standalone example:

public interface IFoo
{
    void M1();
    void M2();
}

public class Foo : IFoo
{
    // Explicit interface implementation
    void IFoo.M1() {}

    // Implicit interface implementation
    public void M2() {}
}

class Test    
{
    static void Main()
    {
        Foo foo = new Foo();

        foo.M1(); // Compile-time failure
        foo.M2(); // Fine

        IFoo ifoo = foo;
        ifoo.M1(); // Fine
        ifoo.M2(); // Fine
    }
}
like image 28
Jon Skeet Avatar answered Nov 01 '22 10:11

Jon Skeet


IList<T>.Count is implemented explicitly:

int[] intArray = new int[10];
IList<int> intArrayAsList = (IList<int>)intArray;
Debug.Assert(intArrayAsList.Count == 10);

This is done so that when you have a simple array variable, you don't have both Count and Length directly available.

In general, explicit interface implementation is used when you want to ensure that a type can be used in a particular way, without forcing all consumers of the type to think about it that way.

Edit: Whoops, bad recall there. ICollection.Count is implemented explicitly. The generic IList<T> is handled as Hans descibes below.

like image 30
dlev Avatar answered Nov 01 '22 11:11

dlev


Explicit interface implementation. In short, you declare it like void IControl.Paint() { } or int IList<T>.Count { get { return 0; } }.

like image 34
Tim S. Avatar answered Nov 01 '22 11:11

Tim S.


It's no different than an explicit interface implementation of IList. Just because you implement the interface doesn't mean its members need to appear as class members. It does implement the Count property, it just doesn't expose it on X[].

like image 26
nitzmahone Avatar answered Nov 01 '22 10:11

nitzmahone


With reference-sources being available:

//----------------------------------------------------------------------------------------
// ! READ THIS BEFORE YOU WORK ON THIS CLASS.
// 
// The methods on this class must be written VERY carefully to avoid introducing security holes.
// That's because they are invoked with special "this"! The "this" object
// for all of these methods are not SZArrayHelper objects. Rather, they are of type U[]
// where U[] is castable to T[]. No actual SZArrayHelper object is ever instantiated. Thus, you will
// see a lot of expressions that cast "this" "T[]". 
//
// This class is needed to allow an SZ array of type T[] to expose IList<T>,
// IList<T.BaseType>, etc., etc. all the way up to IList<Object>. When the following call is
// made:
//
//   ((IList<T>) (new U[n])).SomeIListMethod()
//
// the interface stub dispatcher treats this as a special case, loads up SZArrayHelper,
// finds the corresponding generic method (matched simply by method name), instantiates
// it for type <T> and executes it. 
//
// The "T" will reflect the interface used to invoke the method. The actual runtime "this" will be
// array that is castable to "T[]" (i.e. for primitivs and valuetypes, it will be exactly
// "T[]" - for orefs, it may be a "U[]" where U derives from T.)
//----------------------------------------------------------------------------------------
sealed class SZArrayHelper {
    // It is never legal to instantiate this class.
    private SZArrayHelper() {
        Contract.Assert(false, "Hey! How'd I get here?");
    }

    /* ... snip ... */
}

Specifically this part:

the interface stub dispatcher treats this as a special case, loads up SZArrayHelper, finds the corresponding generic method (matched simply by method name), instantiates it for type and executes it.

(Emphasis mine)

Source (scroll up).

like image 35
AnorZaken Avatar answered Nov 01 '22 10:11

AnorZaken