
Why is the compiler-generated enumerator for "yield" not a struct?

The compiler-generated implementation of IEnumerator / IEnumerable for yield methods and getters seems to be a class, and is therefore allocated on the heap. However, other .NET types such as List<T> specifically return struct enumerators to avoid useless memory allocation. From a quick overview of the C# In Depth post, I see no reason why that couldn't also be the case here.
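The difference described above can be observed at run time. The following is a minimal sketch, not taken from the original post; the names `EnumeratorKinds` and `Numbers` are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class EnumeratorKinds
{
    // An ordinary iterator block; the compiler rewrites it into a hidden type.
    public static IEnumerable<int> Numbers()
    {
        yield return 1;
        yield return 2;
    }

    // Is the object handed back by the iterator method a value type?
    public static bool IteratorIsValueType() => Numbers().GetType().IsValueType;

    // Is the public enumerator type exposed by List<T> a value type?
    public static bool ListEnumeratorIsValueType() =>
        typeof(List<int>.Enumerator).IsValueType;
}
```

With the current C# compiler, the first check reports a reference type and the second a value type, which is the asymmetry the question is about.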

Am I missing something?

Lazlo asked Jan 13 '16 03:01



2 Answers

Servy correctly answered your question -- a question you answered yourself in a comment:

I just realized that since the return type is an interface, it would get boxed anyway, is that right?

Right. Your follow up question is:

couldn't the method be changed to return an explicitly typed enumerator (like List<T> does)?

So your idea here is that the user writes:

IEnumerable<int> Blah() ...

and the compiler actually generates a method that returns BlahEnumerable, a struct that implements IEnumerable<int> and also has the appropriate public GetEnumerator method (and the matching MoveNext and Current members on its enumerator) that allow the "pattern matching" feature of foreach to elide the boxing.
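For comparison, here is roughly what such a struct looks like when written by hand, following the same shape List<T> uses. This is only a sketch: `BlahEnumerable` is the hypothetical name from the paragraph above, and the sequence it produces (0 up to count - 1) and the `BlahDemo` helper are invented for the example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical hand-written struct enumerable; yields 0, 1, ..., count - 1.
public struct BlahEnumerable : IEnumerable<int>
{
    private readonly int _count;
    public BlahEnumerable(int count) { _count = count; }

    // foreach binds to this public method when the static type is
    // BlahEnumerable, so the enumerator stays an unboxed struct.
    public Enumerator GetEnumerator() => new Enumerator(_count);

    // The interface versions convert the struct to an interface, which boxes it.
    IEnumerator<int> IEnumerable<int>.GetEnumerator() => GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    public struct Enumerator : IEnumerator<int>
    {
        private readonly int _count;
        private int _current;

        internal Enumerator(int count) { _count = count; _current = -1; }

        public int Current => _current;
        object IEnumerator.Current => _current;
        public bool MoveNext() => ++_current < _count;
        public void Reset() { _current = -1; }
        public void Dispose() { }
    }
}

public static class BlahDemo
{
    // Static type is the struct: pattern-based foreach, no boxing.
    public static int SumDirect(int count)
    {
        int sum = 0;
        foreach (int x in new BlahEnumerable(count)) sum += x;
        return sum;
    }

    // Static type is the interface: both the enumerable and its enumerator are boxed.
    public static int SumViaInterface(int count)
    {
        IEnumerable<int> seq = new BlahEnumerable(count);
        int sum = 0;
        foreach (int x in seq) sum += x;
        return sum;
    }
}
```

Both loops compute the same sum; only the first avoids heap allocation, and only because the caller can name the struct type.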

Though that is a plausible idea, there are serious difficulties involved when you start lying about the return type of a method. Particularly when the lie involves changing whether the method returns a struct or a reference type. Think of all the things that go wrong:

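  • Suppose the method is virtual. How can it be overridden? The return type of an overriding method must match that of the overridden method exactly. (C# 9 later allowed covariant return types on overrides, but only where a reference conversion exists, so a struct return type would still not qualify.) And similarly for: the method overrides another method, the method implements a method of an interface, and so on.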

  • Suppose the method is made into a delegate Func<IEnumerable<int>>. Func<T> is covariant in T, but covariance only applies to type arguments of reference type. The code looks like it returns an IEnumerable<T> but in fact it returns a value type that is not covariance-compatible with IEnumerable<T>, only assignment compatible.

  • Suppose we have void M<T>(T t) where T : class and we call M(Blah()). We expect to deduce that T is IEnumerable<int>, which passes the constraint check, but the struct type does not pass the constraint check.
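The delegate and constraint problems in the list above can be reproduced today using List<int>.Enumerator as a stand-in for the imagined struct. This is an illustrative sketch; `ReturnTypeLieDemo` and its members are hypothetical names, and `M` returns `typeof(T)` only so that the inferred type argument is visible.

```csharp
using System;
using System.Collections.Generic;

public static class ReturnTypeLieDemo
{
    // A producer that returns a reference type, and one that returns a struct.
    public static List<int> MakeList() => new List<int> { 1, 2, 3 };
    public static List<int>.Enumerator MakeStructEnumerator() =>
        MakeList().GetEnumerator();

    // The constrained method from the third bullet.
    public static Type M<T>(T t) where T : class => typeof(T);

    // Func<List<int>> is variance-convertible to Func<IEnumerable<int>>
    // because List<int> is a reference type.
    public static bool ReferenceReturnIsCovariant()
    {
        object f = new Func<List<int>>(MakeList);
        return f is Func<IEnumerable<int>>;
    }

    // The struct implements IEnumerator<int>, yet the delegate is not
    // convertible: variance never applies to value-type arguments.
    public static bool StructReturnIsCovariant()
    {
        object f = new Func<List<int>.Enumerator>(MakeStructEnumerator);
        return f is Func<IEnumerator<int>>;
    }

    // M(MakeStructEnumerator()) would not compile: T would be inferred as
    // the struct type, which fails the "class" constraint.
}
```

A caller who wrote `IEnumerable<int>` as the return type would expect the first behavior in every case, and would silently get the second if the compiler swapped in a struct.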

And so on. You rapidly end up in an episode of Three's Company (boy am I dating myself here) where a small lie ends up compounding into a huge disaster. All of this to save a small amount of collection pressure. Not worth it.

I note though that the implementation created by the compiler does save on collection pressure in one interesting way. The first time that GetEnumerator is called on the returned enumerable, the enumerable turns itself into an enumerator. The second time of course the state is different so it allocates a new object. Since the 99.99% likely scenario is that a given sequence is enumerated exactly once, this is a big savings on collection pressure.
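The reuse described above is an implementation detail of the compiler rather than a documented guarantee, but it can be probed with reference comparisons. A small sketch, assuming the current C# compiler and that both calls happen on the thread that created the sequence; `EnumeratorReuse` and `Probe` are made-up names.

```csharp
using System;
using System.Collections.Generic;

public static class EnumeratorReuse
{
    public static IEnumerable<int> Blah()
    {
        yield return 1;
        yield return 2;
    }

    // Reports whether the first and second GetEnumerator calls hand back
    // the enumerable object itself.
    public static (bool FirstIsSame, bool SecondIsSame) Probe()
    {
        IEnumerable<int> seq = Blah();
        IEnumerator<int> first = seq.GetEnumerator();
        IEnumerator<int> second = seq.GetEnumerator();
        var result = (object.ReferenceEquals(seq, first),
                      object.ReferenceEquals(seq, second));
        first.Dispose();
        second.Dispose();
        return result;
    }
}
```

With the current compiler the first call returns the enumerable itself and the second returns a fresh object, so a sequence that is enumerated once costs a single allocation.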

Eric Lippert answered Oct 16 '22 22:10


That class will only ever be used through the interface. If it were a struct, it would be boxed 100% of the time, making it less efficient than using a class.

Boxing would be unavoidable. The generated type doesn't exist when you start compiling the code, so your source can never refer to it by name; the only way to use it is through the interface.

When writing a custom implementation of IEnumerator, the concrete type already exists in your source, so you can expose it in the method's signature, allowing callers to use it directly without boxing.
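The boxing cost is easy to see with List<T>, whose enumerator is a struct with a public name. A minimal sketch, with `BoxingDemo` and its methods invented for illustration:

```csharp
using System;
using System.Collections.Generic;

public static class BoxingDemo
{
    // Calling GetEnumerator through the interface returns the same struct
    // type, but wrapped in a heap-allocated box.
    public static bool InterfaceCallReturnsBoxedStruct()
    {
        IEnumerable<int> seq = new List<int> { 1, 2, 3 };
        IEnumerator<int> e = seq.GetEnumerator();
        bool isListStruct = e.GetType() == typeof(List<int>.Enumerator);
        e.Dispose();
        return isListStruct;
    }

    // Every conversion of a struct to an interface type allocates its own box.
    public static bool EachConversionAllocatesNewBox()
    {
        List<int>.Enumerator e = new List<int> { 1 }.GetEnumerator();
        IEnumerator<int> a = e;
        IEnumerator<int> b = e;
        return !object.ReferenceEquals(a, b);
    }
}
```

A struct that is only ever reachable through an interface pays for that box on every use, which is why a class is the cheaper choice for the compiler-generated type.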

Servy answered Oct 16 '22 21:10