IEnumerable does not guarantee that enumerating twice will yield the same result. In fact, it's quite easy to create an example where myEnumerable.First() returns different values when executed twice:
using System;
using System.Collections.Generic;
using System.Linq;

class A {
    public A(string value) { Value = value; }
    public string Value { get; set; }
}

class Program {
    // Array-backed: every enumeration sees the same two instances.
    static IEnumerable<A> getFixedIEnumerable() {
        return new A[] { new A("Hello"), new A("World") };
    }

    // Iterator: every enumeration re-runs the body and creates new instances.
    static IEnumerable<A> getDynamicIEnumerable() {
        yield return new A("Hello");
        yield return new A("World");
    }

    static void Main(string[] args)
    {
        IEnumerable<A> fix = getFixedIEnumerable();
        IEnumerable<A> dyn = getDynamicIEnumerable();
        Console.WriteLine(fix.First() == fix.First()); // True
        Console.WriteLine(dyn.First() == dyn.First()); // False
    }
}
This is not just an academic example: using the popular from ... in ... select new A(...) will create exactly this situation. This can lead to unexpected behaviour:
fix.First().Value = "NEW";
Console.WriteLine(fix.First().Value); // prints NEW
dyn.First().Value = "NEW";
Console.WriteLine(dyn.First().Value); // prints Hello
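To make the from ... in ... select new ... case concrete, here is a minimal sketch. The Item class and the BuildQuery helper are hypothetical names introduced only for this illustration; they are not part of the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type, equivalent in shape to class A above.
class Item
{
    public Item(string value) { Value = value; }
    public string Value { get; set; }
}

static class LinqDemo
{
    // Builds a deferred query; the projection runs only when the result is enumerated.
    public static IEnumerable<Item> BuildQuery(string[] words)
    {
        return from w in words select new Item(w);
    }

    public static void Main()
    {
        IEnumerable<Item> query = BuildQuery(new[] { "Hello", "World" });

        // Each First() call re-runs the projection and creates a new Item.
        Console.WriteLine(object.ReferenceEquals(query.First(), query.First())); // False

        // The mutation hits a throwaway instance, so the next enumeration does not see it.
        query.First().Value = "NEW";
        Console.WriteLine(query.First().Value); // Hello
    }
}
```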
I understand why this happens. I also know that this could be fixed by executing ToList() on the enumerable or by overriding == for class A. That's not my question.
The question is: when you write a method that takes an arbitrary IEnumerable and you want the property that the sequence is only evaluated once (and then the references are "fixed"), what's the canonical way to do this? ToList() seems to be used mostly, but if the source is fixed already (for example, if the source is an array), the references are copied to a list (unnecessarily, since all I need is the fixed property). Is there something more suitable or is ToList() the "canonical" solution for this issue?
ToList is very definitely the way to go.
It has some optimisations: if the input enumeration is an ICollection<T> (e.g. an array or list) it will call ICollection<T>.CopyTo to copy the elements into the new list's backing array rather than actually enumerating, so the cost is unlikely to be significant unless the collection is enormous.
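As a rough illustration of what ToList buys you, the sketch below materializes a lazy sequence once and then shows that the references stay fixed. It also shows that calling ToList on an array copies only the references, not the objects. Item, ToListDemo and Lazy are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type for the illustration.
class Item
{
    public Item(string value) { Value = value; }
    public string Value { get; set; }
}

static class ToListDemo
{
    // Lazy source: a new Item is created on every enumeration.
    public static IEnumerable<Item> Lazy()
    {
        yield return new Item("Hello");
        yield return new Item("World");
    }

    public static void Main()
    {
        // Enumerate exactly once; from here on the references are fixed.
        List<Item> fixedItems = Lazy().ToList();
        Console.WriteLine(object.ReferenceEquals(fixedItems.First(), fixedItems.First())); // True

        fixedItems.First().Value = "NEW";
        Console.WriteLine(fixedItems.First().Value); // NEW

        // For an array, ToList creates a new list that holds the same object references.
        Item[] array = { new Item("a"), new Item("b") };
        List<Item> copy = array.ToList();
        Console.WriteLine(object.ReferenceEquals(array[0], copy[0])); // True
    }
}
```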
IMHO, in general it is better for most methods to return ICollection<T> or IList<T> rather than IEnumerable<T>, unless you want to hint to consumers of the method that the implementation may use lazy evaluation (e.g. yield).
In cases where a method should return an immutable list, return a read-only wrapper (ReadOnlyCollection<T>), e.g. by calling ToList().AsReadOnly(), but still return the interface type IList<T>.
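A minimal sketch of that guideline, assuming a hypothetical GetNames method: the lazy query is materialized once, wrapped with AsReadOnly(), and exposed as IList<string>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ReadOnlyDemo
{
    // Hypothetical method: evaluates the lazy query once and returns a read-only view.
    public static IList<string> GetNames()
    {
        IEnumerable<string> lazy = new[] { "hello", "world" }.Select(s => s.ToUpperInvariant());
        return lazy.ToList().AsReadOnly(); // ReadOnlyCollection<string> implements IList<string>
    }

    public static void Main()
    {
        IList<string> names = GetNames();
        Console.WriteLine(names.Count);      // 2
        Console.WriteLine(names.IsReadOnly); // True

        try
        {
            names.Add("x"); // the wrapper rejects modification
        }
        catch (NotSupportedException)
        {
            Console.WriteLine("read-only");
        }
    }
}
```

Callers get Count and indexing without a further ToList call, and attempts to modify the list fail at run time rather than silently changing shared state.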
If you follow this guideline, consumers of the method won't ever need an unnecessary call to ToList.
You mean if your function takes a parameter and you want to make sure the parameter is not lazy, because you need to iterate over it more than once in the method? I typically make the parameter an ICollection instead in that case, making it the responsibility of the caller to reify the enumerable if it's lazy.
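For example (a sketch only; Spread is a hypothetical method that needs two passes over its input), declaring the parameter as ICollection<int> means a lazy sequence will not compile as an argument until the caller materializes it:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ParameterDemo
{
    // Hypothetical method that iterates its input twice (once for Min, once for Max).
    public static int Spread(ICollection<int> values)
    {
        if (values.Count == 0) throw new ArgumentException("values must not be empty");
        return values.Max() - values.Min();
    }

    public static void Main()
    {
        IEnumerable<int> lazy = Enumerable.Range(1, 5).Select(x => x * x);

        // Spread(lazy) would not compile: the caller has to reify the sequence first.
        Console.WriteLine(Spread(lazy.ToList()));     // 24

        // An array already implements ICollection<int>, so no copy is needed.
        Console.WriteLine(Spread(new[] { 3, 7, 5 })); // 4
    }
}
```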
The IEnumerable interface just gives you "a way" to iterate through elements (creating an IEnumerable from a LINQ query is a perfect example, as such an IEnumerable is like a SQL query for a database). A lazy sequence stays dynamic until you store the result in an ICollection (ToList, ToArray), which runs the iteration to the end and stores the result in a "fixed" way.
The general problem is described by Joel Spolsky in his Law of Leaky Abstractions. With an IEnumerable parameter in your method, you expect an object that can be enumerated and nothing more. Nevertheless, you do care whether the passed collection is "fixed" or "dynamic", so the abstraction provided by IEnumerable leaks. There is no solution that fits all cases well. In your case you can accept List<T> or T[] (or some other types you expect as actual argument types in your method) instead of IEnumerable. The most common advice is to be aware of your abstraction and design your code with respect to it.