Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Trouble understanding yield in C# [duplicate]

I'm hoping to get some clarification on a snippet that I've recently stepped through in the debugger, but simply cannot really understand.

I'm taking a C# course on PluralSight and the current topic is on yield and returning a IEnumerable<T> with the keyword.

I've got this overly basic function that returns an IEnumerable collection of Vendors (A simple class with Id, CompanyName and Email):

public IEnumerable<Vendor> RetrieveWithIterator()
{
    this.Retrieve(); // <-- I've got a breakpoint here
    foreach(var vendor in _vendors)
    {
        Debug.WriteLine($"Vendor Id: {vendor.VendorId}");
        yield return vendor;
    }
}

And I've got this code in a unit test that I'm using to test the function:

var vendorIterator = repository.RetrieveWithIterator(); // <-- Why don't it enter function?
foreach (var item in vendorIterator) // <-- But starts here?
{
    Debug.WriteLine(item);
}
var actual = vendorIterator.ToList();

What I really can't seem to understand, and I'm sure a lot of beginners are having the same trouble, is why the initial call to RetrieveWithIterator doesn't initiate the function, but it rather starts when we start iterating through its returned IEnumerable collection (see the comments).

like image 296
geostocker Avatar asked Jun 28 '17 09:06

geostocker


People also ask

What does yield do in C?

The yield keyword tells the compiler that the method in which it appears is an iterator block. yield return <expression>; yield break; The yield return statement returns one element at a time. The return type of yield keyword is either IEnumerable or IEnumerator .

Does yield break a loop?

In a normal (non-iterating) method you would use the return keyword. But you can't use return in an iterator, you have to use yield break . In other words, yield break for an iterator is the same as return for a standard method. Whereas, the break statement just terminates the closest loop.

What is yield keyword?

Description. The yield keyword pauses generator function execution and the value of the expression following the yield keyword is returned to the generator's caller. It can be thought of as a generator-based version of the return keyword. yield can only be called directly from the generator function that contains it.

What is yield break?

It specifies that an iterator has come to an end. You can think of yield break as a return statement which does not return a value. For example, if you define a function as an iterator, the body of the function may look like this: for (int i = 0; i < 5; i++) { yield return i; } Console.


2 Answers

This is called deferred execution, yield is lazy and will only work as much as it needs to.

This has great many advantages, one of which being that you can create seemingly infinite enumerations:

public IEnumerable<int> InfiniteOnes()
{
     while (true)
         yield 1;
}

Now imagine that the following:

var infiniteOnes = InfiniteOnes();

Would execute eagerly, you'd have a StackOverflow exception coming your way quite happily.

On the other hand, because its lazy, you can do the following:

var infiniteOnes = InfiniteOnes();
//.... some code
foreach (var one in infiniteOnes.Take(100)) { ... }

And later,

foreach (var one in infiniteOnes.Take(10000)) { ... }

Iterator blocks will run only when they need to; when the enumeration is iterated, not before, not after.

like image 93
InBetween Avatar answered Oct 20 '22 04:10

InBetween


From msdn:

Deferred Execution

Deferred execution means that the evaluation of an expression is delayed until its realized value is actually required. Deferred execution can greatly improve performance when you have to manipulate large data collections, especially in programs that contain a series of chained queries or manipulations. In the best case, deferred execution enables only a single iteration through the source collection.

Deferred execution is supported directly in the C# language by the yield keyword (in the form of the yield-return statement) when used within an iterator block. Such an iterator must return a collection of type IEnumerator or IEnumerator<T> (or a derived type).

var vendorIterator = repository.RetrieveWithIterator(); // <-- Lets deferred the execution
foreach (var item in vendorIterator) // <-- execute it because we need it
{
    Debug.WriteLine(item);
}
var actual = vendorIterator.ToList();

Eager vs. Lazy Evaluation

When you write a method that implements deferred execution, you also have to decide whether to implement the method using lazy evaluation or eager evaluation.

  • In lazy evaluation, a single element of the source collection is processed during each call to the iterator. This is the typical way in which iterators are implemented.
  • In eager evaluation, the first call to the iterator will result in the entire collection being processed. A temporary copy of the source collection might also be required.

Lazy evaluation usually yields better performance because it distributes overhead processing evenly throughout the evaluation of the collection and minimizes the use of temporary data. Of course, for some operations, there is no other option than to materialize intermediate results.

source

like image 5
aloisdg Avatar answered Oct 20 '22 04:10

aloisdg