Currently I'm reading a collection of items from a stream. I do this as following:
public class Parser{
private TextReader _reader; //Get set in Constructor
private IEnumerable<Item> _items;
public IEnumerable<Item> Items{
get{
//I >>thought<< this would prevent LoadItems() from being called twice.
return _items ?? (_items = LoadItems());
}
}
public IEnumerable<Item> LoadItems(){
while(_reader.Peek() >= 0){
yield return new Item(_reader.ReadLine()); //Actually it's a little different
}
}
}
Let say I have a stream which contains two items, and I do the following:
var textReader = //Load textreader here
var parser = new Parser(textReader);
var result1 = parser.Items.Count();
var result2 = parser.Items.Count();
Now result1
is 2, while result2
is one.
Now I noticed, that my null check is useless? It seems that everytime I call that function it gets yielded anyway.
Can someone explain to me why this is? And what would be the best solution for this situation (please tell me if what I'm doing is complete crap :P).
yield can only be called directly from the generator function that contains it. It cannot be called from nested functions or from callbacks. The yield keyword causes the call to the generator's next() method to return an IteratorResult object with two properties: value and done .
The term "yield to call" refers to the return a bondholder receives if the security is held until the call date, prior to its date of maturity. Yield to call is applied to callable bonds, which are securities that let bond investors redeem the bonds (or the bond issuer to repurchase them) early, at the call price.
yield keyword is used to create a generator function. A type of function that is memory efficient and can be used like an iterator object. In layman terms, the yield keyword will turn any expression that is given with it into a generator object and return it to the caller.
Yield refers to how much income an investment generates, separate from the principal. It's commonly used to refer to interest payments an investor receives on a bond or dividend payments on a stock. Yield is often expressed as a percentage, based on either the investment's market value or purchase price.
Because LoadItems
is a lazy enumerable (uses yield
) and you are assigning it to a field, it means that every time you enumerate _items
you are actually causing the loop within LoadItems()
to be run again, i.e. (Enumerable.Count
creates a new Enumerator
each time which cause the LoadItems
body to be run again). As you are not creating the reader afresh each time within LoadItems
its cursor will be positioned at the end of the stream so will likely not be able to read any more lines — I suspect that it is returning null
and your your single Item
object returned on the second call contains a null
string.
Solutions to this would be to 'realise' the result of LoadItems
by calling Enumerable.ToList
which will give you a concrete list:
return _items ?? (_items = LoadItems().ToList());
Or seeking the reader back to the beginning of the stream (if possible) such that LoadItems
can run again each time identically.
But I would recommend you simply get rid of the yield
ing in this case and return a concrete list as there is little benefit so you are paying the complexity price for no gain.
Your variable name has led you astray. At the moment:
private IEnumerable<Item> _items;
you are lazy-loading and saving the iterator, whereas you probably want to be lazy-loading and saving the items (as the variable name suggests):
public class Parser{
private TextReader _reader; //Get set in Constructor
private List<Item> _items;
public IEnumerable<Item> Items{
get{
return _items ?? (_items = LoadItems().ToList());
}
}
private IEnumerable<Item> LoadItems(){
while(_reader.Peek() >= 0){
yield return new Item(_reader.ReadLine()); //Actually it's a little different
}
}
}
Consider that yield
is used as short hand. Your code gets turned into something like:
private class <>ImpossibleNameSoItWontCollide : IEnumerator<Item>
{
private TextReader _rdr;
/* other state-holding fields */
public <>ImpossibleNameSoItWontCollide(TextReader rdr)
{
_rdr = rdr;
}
/* Implement MoveNext, Current here */
}
private class <>ImpossibleNameSoItWontCollide2 : IEnumerable<Item>
{
private TextReader _rdr;
/* other state-holding fields */
public <>ImpossibleNameSoItWontCollide2(TextReader rdr)
{
_rdr = rdr;
}
public <>ImpossibleNameSoItWontCollide GetEnumerator()
{
return new <>ImpossibleNameSoItWontCollide(_rdr);
}
/* etc */
}
public IEnumerable<Item> LoadItems()
{
return new <>ImpossibleNameSoItWontCollide2(_rdr);
}
Hence LoadItems()
is indeed only called once, but the object it returns has GetEnumerator()
called on it twice.
And since the TextReader
has moved on, this gives the wrong results for you. Though do note that it leads to a lower use of memory than holding onto all the items, so it's got benefits when you don't want to use the same set of items twice.
Since you do want to, you need to create an object that stores them:
return _items = _items ?? _items = LoadItems().ToList();
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With