I ran across this issue today and I'm not understanding what's going on:
enum Foo
{
    Zero,
    One,
    Two
}

void Main()
{
    IEnumerable<Foo> a = new Foo[] { Foo.Zero, Foo.One, Foo.Two };
    IEnumerable<Foo> b = a.ToList();

    PrintGeneric(a.Cast<int>());
    PrintGeneric(b.Cast<int>());

    Print(a.Cast<int>());
    Print(b.Cast<int>());
}

public static void PrintGeneric<T>(IEnumerable<T> values)
{
    foreach (T value in values)
    {
        Console.WriteLine(value);
    }
}

public static void Print(IEnumerable values)
{
    foreach (object value in values)
    {
        Console.WriteLine(value);
    }
}
Output:
0
1
2
0
1
2
Zero
One
Two
0
1
2
I know Cast() is going to result in deferred execution, but it looks like casting it to IEnumerable results in the deferred execution getting lost, and only if the actual implementing collection is an array.
Why does enumerating the values in the Print method result in the enum being cast to an int for the List<Foo> collection, but not for the Foo[]?
The secret lies in how LINQ uses the IEnumerator and IEnumerable combination to delay (defer) the actual execution. LINQ performs some checks on the source sequence and then, depending on the type of the object, either returns it directly or picks an appropriate iterator object.
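For illustration, here is a simplified sketch of the kind of check involved. The method name CastLike is made up for this example, and the real Enumerable.Cast<TResult> source differs in detail:

using System.Collections;
using System.Collections.Generic;

static class CastSketch
{
    // Hand the source straight back if it is already a sequence of TResult;
    // otherwise defer to an iterator that casts lazily.
    public static IEnumerable<TResult> CastLike<TResult>(this IEnumerable source)
    {
        if (source is IEnumerable<TResult> typed)
        {
            return typed;
        }
        return CastIterator<TResult>(source);
    }

    static IEnumerable<TResult> CastIterator<TResult>(IEnumerable source)
    {
        // Nothing here runs until the caller starts enumerating.
        foreach (object item in source)
        {
            yield return (TResult)item;
        }
    }
}

With a shape like this, an array that already satisfies the target element type is returned as-is, while a List<Foo> would go through the lazy iterator.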
IEnumerable also helps avoid unintended side effects: because you are always querying the source itself, nothing is copied or mutated along the way. Lists and arrays, by contrast, are materialized collections in memory and expose the full set of methods associated with those types (Lists | Arrays).
Deferred execution applies to any in-memory collection as well as to remote LINQ providers such as LINQ-to-SQL, LINQ-to-Entities, LINQ-to-XML, and so on. In the example above, the query is only materialized and executed when you iterate with the foreach loop; that is what deferred execution means.
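As a small self-contained illustration of that point (the names here are just for this example, not from the question):

using System;
using System.Collections.Generic;
using System.Linq;

class DeferredDemo
{
    static void Main()
    {
        var numbers = new List<int> { 1, 2, 3 };

        // No work happens here; the query only captures the source and the projection.
        IEnumerable<int> doubled = numbers.Select(n => n * 2);

        // A later change to the source is still picked up...
        numbers.Add(4);

        // ...because the projection runs only now, while we iterate.
        foreach (int n in doubled)
        {
            Console.WriteLine(n); // 2, 4, 6, 8
        }
    }
}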
It's because of an optimization which is unfortunately slightly broken in the face of unexpected CLR conversions.
At the CLR level, there's a reference conversion from a Foo[] to an int[] - you don't actually need to cast each object at all. That's not true at the C# level, but it is at the CLR level.
Now, Cast<> contains an optimization to say "if I'm already dealing with a collection of the right type, I can just return the same reference back" - effectively like this:
if (source is IEnumerable<T> typedSource)
{
    return typedSource;
}
So a.Cast<int>() returns a, which is a Foo[]. That's fine when you pass it to PrintGeneric, because then there's an implicit conversion to T in the foreach loop. The compiler knows that the type of IEnumerator<T>.Current is T, so the relevant stack slot is of type T. The per-type-argument JIT-compiled code will "do the right thing" when treating the value as an int rather than as a Foo.
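To see that the CLR really is happy to treat a Foo[] as a sequence of int, here is a small sketch (the variable names are just for illustration; the cast through object stops the C# compiler from objecting, and the CLR accepts the conversion at execution time):

using System;
using System.Collections.Generic;

enum Foo { Zero, One, Two }

class GenericPathDemo
{
    static void Main()
    {
        Foo[] foos = { Foo.Zero, Foo.One, Foo.Two };

        // The C# compiler wouldn't allow this conversion directly,
        // but via object the CLR happily treats Foo[] as IEnumerable<int>.
        IEnumerable<int> ints = (IEnumerable<int>)(object)foos;

        foreach (int i in ints)
        {
            Console.WriteLine(i); // 0, 1, 2
        }
    }
}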
However, when you pass the array as an IEnumerable, the Current property on the IEnumerator is just of type object, so each value will be boxed and passed to Console.WriteLine(object) - and the boxed object will be of type Foo, not int.
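A quick way to observe that boxing difference is to pull a value out through the non-generic IEnumerator and look at its runtime type (a rough check; the enumerator variables are named just for this example):

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

enum Foo { Zero, One, Two }

class BoxingDemo
{
    static void Main()
    {
        IEnumerable<Foo> a = new[] { Foo.Zero, Foo.One, Foo.Two };
        IEnumerable<Foo> b = a.ToList();

        // The array: Cast<int> hands back the Foo[] itself, so the
        // non-generic enumerator boxes each element as a Foo.
        IEnumerator fromArray = ((IEnumerable)a.Cast<int>()).GetEnumerator();
        fromArray.MoveNext();
        Console.WriteLine(fromArray.Current.GetType()); // Foo

        // The list: Cast<int> builds a real iterator that converts each
        // element, so the boxed values are genuine ints.
        IEnumerator fromList = ((IEnumerable)b.Cast<int>()).GetEnumerator();
        fromList.MoveNext();
        Console.WriteLine(fromList.Current.GetType()); // System.Int32
    }
}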
Here's some sample code to show the first part of this - the rest is a little simpler to understand, I believe, once you've got past that:
using System;
using System.Linq;

enum Foo { }

class Test
{
    static void Main()
    {
        Foo[] x = new Foo[10];

        // False because the C# compiler is cocky, and "optimizes" it out
        Console.WriteLine(x is int[]);

        // True because when we put a blindfold in front of the compiler,
        // the evaluation is left to the CLR
        Console.WriteLine(((object) x) is int[]);

        // Foo[] and True because Cast returns the same reference back
        Console.WriteLine(x.Cast<int>().GetType());
        Console.WriteLine(ReferenceEquals(x, x.Cast<int>()));
    }
}
You'll see the same thing if you try to go between uint[] and int[], by the way.
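For example (a minimal check along those lines; the variable names are made up here):

using System;
using System.Linq;

class UnsignedDemo
{
    static void Main()
    {
        uint[] unsigned = { 1u, 2u, 3u };

        // The C# compiler won't allow a direct cast between uint[] and int[],
        // but the CLR treats them as reference-compatible.
        Console.WriteLine(((object)unsigned) is int[]); // True

        // So Cast<int> returns the original array reference here too.
        Console.WriteLine(ReferenceEquals(unsigned, unsigned.Cast<int>())); // True
    }
}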