I'm hoping to get some clarification on a snippet that I've recently stepped through in the debugger but simply cannot understand.
I'm taking a C# course on Pluralsight and the current topic is yield and returning an IEnumerable<T> with the keyword.
I've got this overly basic function that returns an IEnumerable collection of Vendors (a simple class with Id, CompanyName and Email):
public IEnumerable<Vendor> RetrieveWithIterator()
{
    this.Retrieve(); // <-- I've got a breakpoint here
    foreach (var vendor in _vendors)
    {
        Debug.WriteLine($"Vendor Id: {vendor.VendorId}");
        yield return vendor;
    }
}
And I've got this code in a unit test that I'm using to test the function:
var vendorIterator = repository.RetrieveWithIterator(); // <-- Why doesn't it enter the function?
foreach (var item in vendorIterator) // <-- But it starts here?
{
    Debug.WriteLine(item);
}
var actual = vendorIterator.ToList();
What I really can't seem to understand, and I'm sure a lot of beginners have the same trouble, is why the initial call to RetrieveWithIterator doesn't run the function body, which instead starts only when we begin iterating through the returned IEnumerable collection (see the comments).
The yield keyword tells the compiler that the method in which it appears is an iterator block. It has two forms: yield return <expression>; and yield break;. The yield return statement returns one element at a time. A method that uses yield must have a return type of IEnumerable or IEnumerator (or their generic counterparts IEnumerable<T> and IEnumerator<T>).
In a normal (non-iterating) method you would use the return keyword. You can't use return in an iterator, though; you have to use yield break. In other words, yield break in an iterator plays the same role as return in a standard method, whereas the break statement just terminates the closest enclosing loop.
For comparison, JavaScript generator functions work on the same principle: the yield keyword pauses generator function execution, and the value of the expression following yield is returned to the generator's caller. It can be thought of as a generator-based version of the return keyword. In JavaScript, yield can only be used directly inside the generator function that contains it.
yield break specifies that an iterator has come to an end. You can think of yield break as a return statement which does not return a value. For example, if you define a function as an iterator, the body of the function may look like this: for (int i = 0; i < 5; i++) { yield return i; } Console.
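To make the yield break behaviour concrete, here is a minimal sketch. It assumes C# 9 or later (top-level statements), and TakeUntilNegative is a hypothetical name made up for this illustration, not something from the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical iterator: yields values from the input until it meets a negative one.
IEnumerable<int> TakeUntilNegative(IEnumerable<int> source)
{
    foreach (var value in source)
    {
        if (value < 0)
        {
            yield break;       // ends the whole iterator, like return in a normal method
        }
        yield return value;    // hands one element to the caller, then pauses here
    }
}

var result = new List<int>(TakeUntilNegative(new[] { 3, 1, -4, 1, 5 }));
Console.WriteLine(string.Join(", ", result));   // 3, 1
```

A plain break in the same position would only leave the foreach loop; any code after the loop would still run before the iterator finished.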
This is called deferred execution: an iterator built with yield is lazy and will only do as much work as it needs to.
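A small sketch of that ordering, assuming C# 9 or later (top-level statements). Numbers is a hypothetical stand-in for RetrieveWithIterator, and the log list is only there to record what runs when.

```csharp
using System;
using System.Collections.Generic;

// Records what happens and when, so the ordering is easy to inspect.
var log = new List<string>();

// Hypothetical iterator standing in for RetrieveWithIterator().
IEnumerable<int> Numbers()
{
    log.Add("body started");   // not reached until the first MoveNext()
    yield return 1;
    log.Add("resumed after 1");
    yield return 2;
}

var sequence = Numbers();      // only creates the iterator object
log.Add("after call");

foreach (var n in sequence)    // foreach calls MoveNext(), which runs the body
{
    log.Add($"got {n}");
}

Console.WriteLine(string.Join(" | ", log));
// after call | body started | got 1 | resumed after 1 | got 2
```

This is why a breakpoint on the first line of the iterator is only hit once the foreach starts: the compiler turns the method into a state machine, and the call itself just constructs it.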
This has a great many advantages, one of which is that you can create seemingly infinite enumerations:
public IEnumerable<int> InfiniteOnes()
{
    while (true)
        yield return 1;
}
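Now imagine that the following executed eagerly:
var infiniteOnes = InfiniteOnes();
The call would never return, and a version that buffered its results would eventually throw an OutOfMemoryException (not a StackOverflowException, since no recursion is involved).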
On the other hand, because it's lazy, you can do the following:
var infiniteOnes = InfiniteOnes();
//.... some code
foreach (var one in infiniteOnes.Take(100)) { ... }
And later,
foreach (var one in infiniteOnes.Take(10000)) { ... }
Iterator blocks will run only when they need to; when the enumeration is iterated, not before, not after.
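To see that only the requested amount of work happens, here is a sketch assuming C# 9 or later (top-level statements). The produced counter is an addition for illustration and is not part of the original InfiniteOnes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var produced = 0;   // counts how many values the iterator body actually yielded

IEnumerable<int> InfiniteOnes()
{
    while (true)
    {
        produced++;
        yield return 1;
    }
}

var infiniteOnes = InfiniteOnes();          // nothing runs yet
Console.WriteLine(produced);                // 0

var firstHundred = infiniteOnes.Take(100).ToList();
Console.WriteLine(firstHundred.Count);      // 100
Console.WriteLine(produced);                // 100
```

Take stops asking for values once it has enough, so the while (true) loop simply stays paused at its last yield return.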
From MSDN:
Deferred execution means that the evaluation of an expression is delayed until its realized value is actually required. Deferred execution can greatly improve performance when you have to manipulate large data collections, especially in programs that contain a series of chained queries or manipulations. In the best case, deferred execution enables only a single iteration through the source collection.
Deferred execution is supported directly in the C# language by the yield keyword (in the form of the yield-return statement) when used within an iterator block. Such an iterator must return a collection of type IEnumerator or IEnumerator<T> (or a derived type).
var vendorIterator = repository.RetrieveWithIterator(); // <-- execution is deferred here
foreach (var item in vendorIterator) // <-- the body runs now, because we need the values
{
    Debug.WriteLine(item);
}
var actual = vendorIterator.ToList();
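One consequence worth noticing in the test code above: the foreach and the ToList() call each enumerate the sequence, so the iterator body runs twice. A sketch, assuming C# 9 or later (top-level statements); it uses strings instead of the Vendor class, and the retrieveCalls counter is a hypothetical stand-in for the call to this.Retrieve().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var retrieveCalls = 0;   // stands in for how often this.Retrieve() would run

// Hypothetical stand-in for RetrieveWithIterator(), using strings instead of Vendor.
IEnumerable<string> RetrieveWithIterator()
{
    retrieveCalls++;                 // runs once per enumeration, not once per call
    yield return "Vendor A";
    yield return "Vendor B";
}

var vendorIterator = RetrieveWithIterator();
Console.WriteLine(retrieveCalls);    // 0: the body has not started

foreach (var item in vendorIterator) { }
Console.WriteLine(retrieveCalls);    // 1: first enumeration

var actual = vendorIterator.ToList();
Console.WriteLine(retrieveCalls);    // 2: ToList() enumerates again from the start
```

If the retrieval is expensive, calling ToList() once and iterating over the resulting list avoids the second run.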
When you write a method that implements deferred execution, you also have to decide whether to implement the method using lazy evaluation or eager evaluation.
Lazy evaluation usually yields better performance because it distributes overhead processing evenly throughout the evaluation of the collection and minimizes the use of temporary data. Of course, for some operations, there is no other option than to materialize intermediate results.
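The difference between the two can be sketched as follows, assuming C# 9 or later (top-level statements). SquaresEager, SquaresLazy and the squared counter are hypothetical names for this illustration. Both methods are deferred, but the eager one processes the whole source on the first MoveNext().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var squared = 0;   // counts how many elements were actually processed

int Square(int x) { squared++; return x * x; }

// Deferred but eager: nothing runs until iteration starts,
// then the whole source is processed and buffered up front.
IEnumerable<int> SquaresEager(IEnumerable<int> source)
{
    var results = new List<int>();
    foreach (var x in source) results.Add(Square(x));
    foreach (var r in results) yield return r;
}

// Deferred and lazy: one element is processed each time the caller asks for the next value.
IEnumerable<int> SquaresLazy(IEnumerable<int> source)
{
    foreach (var x in source) yield return Square(x);
}

var input = Enumerable.Range(1, 1000);

var firstEager = SquaresEager(input).First();
var eagerWork = squared;            // 1000

squared = 0;
var firstLazy = SquaresLazy(input).First();
var lazyWork = squared;             // 1

Console.WriteLine($"{firstEager} {eagerWork} {firstLazy} {lazyWork}");   // 1 1000 1 1
```

Operations such as sorting need the eager form, because no element can be returned until every input element has been seen.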