I am implementing my own enumerable type. Something ressembling this:
public class LineReaderEnumerable : IEnumerable<string>, IDisposable
{
private readonly LineEnumerator enumerator;
public LineReaderEnumerable(FileStream fileStream)
{
enumerator = new LineEnumerator(new StreamReader(fileStream, Encoding.Default));
}
public IEnumerator<string> GetEnumerator()
{
return enumerator;
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
public void Dispose()
{
enumerator.Dispose();
}
}
The enumerator class:
public class LineEnumerator : IEnumerator<string>
{
private readonly StreamReader reader;
private string current;
public LineEnumerator(StreamReader reader)
{
this.reader = reader;
}
public void Dispose()
{
reader.Dispose();
}
public bool MoveNext()
{
if (reader.EndOfStream)
{
return false;
}
current = reader.ReadLine();
return true;
}
public void Reset()
{
reader.DiscardBufferedData();
reader.BaseStream.Seek(0, SeekOrigin.Begin);
reader.BaseStream.Position = 0;
}
public string Current
{
get { return current; }
}
object IEnumerator.Current
{
get { return Current; }
}
}
My question is this: should I call Reset() on the enumerator when GetEnumerator() is called or is it the responsability of the calling method (like foreach) to do it?
Should GetEnumerator() create a new one, or is it supposed to always return the same instance?
Your model is fundamentally broken - you should create a new IEnumerator<T>
each time GetEnumerator()
is called. Iterators are meant to be independent of each other. For example, I ought to be able to write:
var lines = new LinesEnumerable(...);
foreach (var line1 in lines)
{
foreach (var line2 in lines)
{
...
}
}
and basically get the cross-product of each line in the file against each of the other lines.
This means LineEnumerable
class should not be given a FileStream
- it should be given something which can be used to obtain a FileStream
each time you need one, e.g. a filename.
For example, you can do all of this in a single method call using iterator blocks:
// Like File.ReadLines in .NET 4 - except that's broken (see comments)
public IEnumerable<string> ReadLines(string filename)
{
using (TextReader reader = File.OpenText(filename))
{
string line;
while ((line = reader.ReadLine()) != null)
{
yield return line;
}
}
}
Then:
var lines = ReadLines(filename);
// foreach loops as before
... that will work fine.
EDIT: Note that certain sequences are naturally iterable only once - e.g. a network stream, or a sequence of random numbers from an unknown seed.
Such sequences are really better expressed as IEnumerator<T>
rather than IEnumerable<T>
, but that makes filtering etc with LINQ harder. IMO such sequences should at least throw an exception on the second call to GetEnumerator()
- returning the same iterator twice is a really bad idea.
The expectation of a user of your type would be that GetEnumerator()
returns a new enumerator object.
As you have defined it every call to GetEnumerator
returns the same enumerator, so code like:
var e1 = instance.GetEnumerator();
e1.MoveNext();
var first = e1.Value();
var e2 = instance.GetEnumerator();
e2.MoveNext();
var firstAgain = e2.Value();
Debug.Assert(first == firstAgain);
will not work as expected.
(An internal call to Reset
would be an unusual design, but that's secondary here.)
Additional: PS If you want an enumerator over the lines of a file then use File.ReadLines
, but it appears (see comments on Jon Skeet's answer) this suffers from the same problem as your code.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With