Must .NET's IList be finite? Suppose I write a class FibonacciList implementing IList<BigInteger>
We've now implemented all the methods expected of read-only ILists except Count().
Is this cool, or an abuse of IList?
Fibonacci numbers get impractically big quickly (hence IList<BigInteger>
above) . A bounded infinite sequence might be more sensible, it could implement IList<long>
or IList<double>
.
Addendum II: Fibonacci sequence may have been a bad example, because computing distant values is expensive - to find the nth value one has to compute all earlier values. Thus as Mošmondor said, one might as well make it an IEnumerable and use .ElementAt
. However there exist other sequences where one can compute distant values quickly without computing earlier values. (Surprisingly the digits of pi are such a sequence). These sequences are more 'listy', they truly support random access.
Edit: No-one argues against infinite IEnumerables. How do they handle Count()?
To most developers, IList
and ICollection
imply that you have a pre-evaluated, in-memory collection to work with. With IList
specifically, there is an implicit contract of constant-time Add
* and indexing operations. This is why LinkedList<T>
does not implement IList<T>
. I would consider a FibonacciList to be a violation of this implied contract.
Note the following paragraph from a recent MSDN Magazine article discussing the reasons for adding read-only collection interfaces to .NET 4.5:
IEnumerable<T>
is sufficient for most scenarios that deal with collections of types, but sometimes you need more power than it provides:
- Materialization:
IEnumerable<T>
does not allow you to express whether the collection is already available (“materialized”) or whether it’s computed every time you iterate over it (for example, if it represents a LINQ query). When an algorithm requires multiple iterations over the collection, this can result in performance degradation if computing the sequence is expensive; it can also cause subtle bugs because of identity mismatches when objects are being generated again on subsequent passes.
As others have pointed out, there is also the question of what you would return for .Count
.
It's perfectly fine to use IEnumerable
or IQueryable
in for such collections of data, because there is an expectation that these types can be lazily evaluated.
Regarding Edit 1: .Count()
is not implemented by the IEnumerable<T>
interface: it is an extension method. As such, developers need to expect that it can take any amount of time, and they need to avoid calling it in cases where they don't actually need to know the number of items. For example, if you just want to know whether an IEnumerable<T>
has any items, it's better to use .Any()
. If you know that there's a maximum number of items you want to deal with, you can use .Take()
. If a collection has more than int.MaxValue
items in it, .Count()
will encounter an operation overflow. So there are some workarounds that can help to reduce the danger associated with infinite sequences. Obviously if programmers haven't taken these possibilities into account, it can still cause problems, though.
Regarding Edit 2: If you're planning to implement your sequence in a way that indexing is constant-time, that addresses my main point pretty handily. Sixlettervariables's answer still holds true, though.
*Obviously there's more to this: Add
is only expected to work if IList.IsFixedSize
returns false
. Modification is only possible if IsReadOnly
returns false, etc. IList
was a poorly-thought-out interface in the first place: a fact which may finally be remedied by the introduction of read-only collection interfaces in .NET 4.5.
Having given this some additional thought, I've come to the personal opinion that IEnumerable<>
s should not be infinite either. In addition to materializing methods like .ToList()
, LINQ has several non-streaming operations like .OrderBy()
which must consume the entire IEnumerable<>
before the first result can be returned. Since so many methods assume IEnumerable<>
s are safe to traverse in their entirety, it would be a violation of the Liskov Substitution Principle to produce an IEnumerable<>
that is inherently unsafe to traverse indefinitely.
If you find that your application often requires segments of the Fibonacci sequence as IEnumerables, I'd suggest creating a method with a signature similar to Enumerable.Range(int, int)
, which allows the user to define a starting and ending index.
If you'd like to embark on a Gee-Whiz project, you could conceivably develop a Fibonacci-based IQueryable<>
provider, where users could use a limited subset of LINQ query syntax, like so:
// LINQ to Fibonacci!
var fibQuery = from n in Fibonacci.Numbers // (returns an IQueryable<>)
where n.Index > 5 && n.Value < 20000
select n.Value;
var fibCount = fibQuery.Count();
var fibList = fibQuery.ToList();
Since your query provider would have the power to evaluate the where
clauses as lambda expressions, you could have enough control to implement Count
methods and .GetEnumerator()
in a way as to ensure that the query is restrictive enough to produce a real answer, or throw an exception as soon as the method is called.
But this reeks of being clever, and would probably be a really bad idea for any real-life software.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With