Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

yield always gets called

Currently I'm reading a collection of items from a stream. I do this as following:

public class Parser{

 private TextReader _reader; //Get set in Constructor
 private IEnumerable<Item> _items;    

 public IEnumerable<Item> Items{
  get{
   //I >>thought<< this would prevent LoadItems() from being called twice.
   return _items ?? (_items = LoadItems());
  }
 }

 public IEnumerable<Item> LoadItems(){
  while(_reader.Peek() >= 0){
   yield return new Item(_reader.ReadLine()); //Actually it's a little different
  }
 }
}

Let say I have a stream which contains two items, and I do the following:

var textReader = //Load textreader here
var parser = new Parser(textReader);
var result1 = parser.Items.Count();
var result2 = parser.Items.Count();

Now result1 is 2, while result2 is one.

Now I noticed, that my null check is useless? It seems that everytime I call that function it gets yielded anyway.

Can someone explain to me why this is? And what would be the best solution for this situation (please tell me if what I'm doing is complete crap :P).

like image 499
Timo Willemsen Avatar asked Jan 26 '12 10:01

Timo Willemsen


People also ask

How do you call a yield function?

yield can only be called directly from the generator function that contains it. It cannot be called from nested functions or from callbacks. The yield keyword causes the call to the generator's next() method to return an IteratorResult object with two properties: value and done .

What is yield to first call?

The term "yield to call" refers to the return a bondholder receives if the security is held until the call date, prior to its date of maturity. Yield to call is applied to callable bonds, which are securities that let bond investors redeem the bonds (or the bond issuer to repurchase them) early, at the call price.

What is yield used for?

yield keyword is used to create a generator function. A type of function that is memory efficient and can be used like an iterator object. In layman terms, the yield keyword will turn any expression that is given with it into a generator object and return it to the caller.

What is yield in finance terms?

Yield refers to how much income an investment generates, separate from the principal. It's commonly used to refer to interest payments an investor receives on a bond or dividend payments on a stock. Yield is often expressed as a percentage, based on either the investment's market value or purchase price.


3 Answers

Because LoadItems is a lazy enumerable (uses yield) and you are assigning it to a field, it means that every time you enumerate _items you are actually causing the loop within LoadItems() to be run again, i.e. (Enumerable.Count creates a new Enumerator each time which cause the LoadItems body to be run again). As you are not creating the reader afresh each time within LoadItems its cursor will be positioned at the end of the stream so will likely not be able to read any more lines — I suspect that it is returning null and your your single Item object returned on the second call contains a null string.

Solutions to this would be to 'realise' the result of LoadItems by calling Enumerable.ToList which will give you a concrete list:

return _items ?? (_items = LoadItems().ToList());

Or seeking the reader back to the beginning of the stream (if possible) such that LoadItems can run again each time identically.

But I would recommend you simply get rid of the yielding in this case and return a concrete list as there is little benefit so you are paying the complexity price for no gain.

like image 179
Paul Ruane Avatar answered Oct 18 '22 12:10

Paul Ruane


Your variable name has led you astray. At the moment:

 private IEnumerable<Item> _items; 

you are lazy-loading and saving the iterator, whereas you probably want to be lazy-loading and saving the items (as the variable name suggests):

public class Parser{

 private TextReader _reader; //Get set in Constructor
 private List<Item> _items;    

 public IEnumerable<Item> Items{
  get{
   return _items ?? (_items = LoadItems().ToList());
  }
 }

 private IEnumerable<Item> LoadItems(){
  while(_reader.Peek() >= 0){
   yield return new Item(_reader.ReadLine()); //Actually it's a little different
  }
 }
}
like image 37
AakashM Avatar answered Oct 18 '22 10:10

AakashM


Consider that yield is used as short hand. Your code gets turned into something like:

private class <>ImpossibleNameSoItWontCollide : IEnumerator<Item>
{
   private TextReader _rdr;
   /* other state-holding fields */
   public <>ImpossibleNameSoItWontCollide(TextReader rdr)
   {
     _rdr = rdr;
   }
   /* Implement MoveNext, Current here */
}
private class <>ImpossibleNameSoItWontCollide2 : IEnumerable<Item>
{
   private TextReader _rdr;
   /* other state-holding fields */
   public <>ImpossibleNameSoItWontCollide2(TextReader rdr)
   {
     _rdr = rdr;
   }
   public <>ImpossibleNameSoItWontCollide GetEnumerator()
   {
     return new <>ImpossibleNameSoItWontCollide(_rdr);
   }
   /* etc */
}
public IEnumerable<Item> LoadItems()
{
    return new <>ImpossibleNameSoItWontCollide2(_rdr);
}

Hence LoadItems() is indeed only called once, but the object it returns has GetEnumerator() called on it twice.

And since the TextReader has moved on, this gives the wrong results for you. Though do note that it leads to a lower use of memory than holding onto all the items, so it's got benefits when you don't want to use the same set of items twice.

Since you do want to, you need to create an object that stores them:

return _items = _items ?? _items = LoadItems().ToList();
like image 1
Jon Hanna Avatar answered Oct 18 '22 11:10

Jon Hanna