
Why Does an Array Cast as IEnumerable Ignore Deferred Execution?

I ran across this issue today and I'm not understanding what's going on:

enum Foo
{
    Zero,
    One,
    Two
}

void Main()
{
    IEnumerable<Foo> a = new Foo[]{ Foo.Zero, Foo.One, Foo.Two};
    IEnumerable<Foo> b = a.ToList();

    PrintGeneric(a.Cast<int>());
    PrintGeneric(b.Cast<int>());

    Print(a.Cast<int>());
    Print(b.Cast<int>());
}

public static void PrintGeneric<T>(IEnumerable<T> values){
    foreach(T value in values){
        Console.WriteLine(value);
    }
}

public static void Print(IEnumerable values){
    foreach(object value in values){
        Console.WriteLine(value);
    }
}

Output:

0
1
2
0
1
2
Zero
One
Two
0
1
2

I know Cast() uses deferred execution, but it looks like passing the result as a non-generic IEnumerable causes the cast to be lost, and only when the underlying collection is an array.

Why does enumerating the values in the Print method convert the enum to an int for the List<Foo> collection, but not for the Foo[]?

asked Nov 04 '13 at 16:11 by Daryl


1 Answer

It's because of an optimization which is unfortunately slightly broken in the face of unexpected CLR conversions.

At the CLR level, there's a reference conversion from a Foo[] to int[] - you don't actually need to cast each object at all. That's not true at the C# level, but it is at the CLR level.

Now, Cast<> contains an optimization to say "if I'm already dealing with a collection of the right type, I can just return the same reference back" - effectively like this:

if (source is IEnumerable<T>)
{
    return source;
}
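
To make the shortcut concrete, here is a minimal sketch of how a Cast-like method could be put together. MyCast and CastIterator are hypothetical names used only for illustration; this is not the framework source, just the shape of the optimization under that assumption.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

enum Foo { Zero, One, Two }

static class CastSketch
{
    // Hypothetical stand-in for Enumerable.Cast<T>; not the real implementation.
    public static IEnumerable<T> MyCast<T>(IEnumerable source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        // Shortcut: this type check is evaluated by the CLR at run time,
        // and the CLR treats a Foo[] as an IEnumerable<int>.
        if (source is IEnumerable<T> typed)
        {
            return typed;
        }

        return CastIterator<T>(source);
    }

    // Deferred path: each element is cast only when it is requested.
    private static IEnumerable<T> CastIterator<T>(IEnumerable source)
    {
        foreach (object item in source)
        {
            yield return (T)item;
        }
    }

    static void Main()
    {
        IEnumerable array = new Foo[] { Foo.Zero, Foo.One, Foo.Two };
        IEnumerable list = new List<Foo> { Foo.Zero, Foo.One, Foo.Two };

        // The array takes the shortcut, so the same reference comes back.
        Console.WriteLine(ReferenceEquals(array, MyCast<int>(array))); // True

        // The list is not an IEnumerable<int>, so it gets wrapped in an iterator.
        Console.WriteLine(ReferenceEquals(list, MyCast<int>(list)));   // False
    }
}
```

Only the list goes through the per-element cast, which is why the two collections behave differently later on.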

So a.Cast<int>() returns a, which is a Foo[]. That's fine when you pass it to PrintGeneric, because then there's an implicit conversion to T in the foreach loop. The compiler knows that the type of IEnumerator<T>.Current is T, so the relevant stack slot is of type T. The per-type-argument JIT-compiled code will "do the right thing" when treating the value as an int rather than as a Foo.

However, when you pass the array as an IEnumerable, the Current property on the IEnumerator is just of type object, so each value will be boxed and passed to Console.WriteLine(object) - and the boxed object will be of type Foo, not int.
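
The boxing behaviour can be observed directly. The sketch below uses a hypothetical helper, FirstBoxedType, to report the runtime type of the first element seen through the non-generic interface. It also shows one possible way to force a per-element conversion with Select; that is a suggestion on my part, not something the explanation above prescribes.

```csharp
using System;
using System.Collections;
using System.Linq;

enum Foo { Zero, One, Two }

static class BoxingDemo
{
    // Hypothetical helper: runtime type of the first element when
    // enumerated through the non-generic IEnumerable.
    public static Type FirstBoxedType(IEnumerable values)
    {
        foreach (object value in values)
        {
            return value.GetType();
        }
        throw new InvalidOperationException("Sequence was empty.");
    }

    static void Main()
    {
        Foo[] a = { Foo.Zero, Foo.One, Foo.Two };

        // Cast<int>() hands back the array itself, so each element boxes as Foo.
        Console.WriteLine(FirstBoxedType(a.Cast<int>()));         // Foo

        // Select converts every element, so each element boxes as an int.
        Console.WriteLine(FirstBoxedType(a.Select(x => (int)x))); // System.Int32
    }
}
```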

Here's some sample code to show the first part of this - the rest is a little simpler to understand, I believe, once you've got past that:

using System;
using System.Linq;

enum Foo { }

class Test
{
    static void Main()
    {
        Foo[] x = new Foo[10];
        // False because the C# compiler is cocky, and "optimizes" it out
        Console.WriteLine(x is int[]);

        // True because when we put a blindfold in front of the compiler,
        // the evaluation is left to the CLR
        Console.WriteLine(((object) x) is int[]);

        // Foo[] and True because Cast returns the same reference back
        Console.WriteLine(x.Cast<int>().GetType());
        Console.WriteLine(ReferenceEquals(x, x.Cast<int>()));
    }
}

You'll see the same thing if you try to go between uint[] and int[], by the way.
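
As a quick illustration of the uint[] case, the following sketch repeats the checks from the sample above with an array of uint. CastReturnsSameArray is a hypothetical helper name introduced only for this example.

```csharp
using System;
using System.Linq;

static class UIntDemo
{
    // Hypothetical helper: true when Cast<int>() returns the array itself.
    public static bool CastReturnsSameArray(uint[] values)
    {
        return ReferenceEquals(values, values.Cast<int>());
    }

    static void Main()
    {
        uint[] values = { 1, 2, 3 };

        // The cast to object leaves the type check to the CLR.
        Console.WriteLine(((object) values) is int[]);   // True

        // Cast<int>() takes the same shortcut as it does for Foo[].
        Console.WriteLine(CastReturnsSameArray(values)); // True
    }
}
```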

answered Oct 12 '22 at 22:10 by Jon Skeet