I was reading through a question asking "Is it better to call ToList() or ToArray() in LINQ queries?" and found myself wondering why Enumerable.ToArray() wouldn't first just call the Count() method to find the size of the collection instead of using the internal Buffer<T> class, which dynamically resizes itself. Something like the following:
T[] ToArray<T>(IEnumerable<T> source)
{
    var count = source.Count();
    var array = new T[count];
    int index = 0;
    foreach (var item in source) array[index++] = item;
    return array;
}
I know that we can't understand what is going through the minds of the designers and implementers, and I'm sure they're much smarter than I am. So the best way to ask this question is: what's wrong with the approach shown above? It seems to allocate less memory and still runs in O(n) time.
First, the Buffer<T> class constructor also optimizes for the case where the specified sequence can be cast to ICollection<T> (like an array or a list), which has a Count property:
TElement[] array = null;
int num = 0;
ICollection<TElement> collection = source as ICollection<TElement>;
if (collection != null)
{
    num = collection.Count;
    if (num > 0)
    {
        array = new TElement[num];
        collection.CopyTo(array, 0);
    }
}
else
    // now we are going the long way ...
So if it's not a collection, the query must be executed to get the total count. But using Enumerable.Count just to allocate a correctly sized array can be very expensive and, more importantly, can have dangerous side effects. Hence it is unsafe.
Consider this simple File.ReadLines example:
var lines = File.ReadLines(path);
int count = lines.Count(); // executes the query which also disposes the underlying IO.TextReader
var array = new string[count];
int index = 0;
foreach (string line in lines) array[index++] = line;
This will throw an ObjectDisposedException ("Cannot read from a closed TextReader"), because lines.Count() has already executed the query and disposed the reader by the time the foreach runs.
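The double enumeration is easy to see without touching the file system. Here is a minimal sketch, assuming a made-up iterator with a visible side effect; `DoubleEnumerationDemo`, `Runs`, `Numbers` and `CountThenCopy` are names invented for this illustration and are not part of LINQ:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DoubleEnumerationDemo
{
    // Hypothetical counter: how many times the iterator body has been started.
    public static int Runs;

    // A lazy sequence with a side effect; a stand-in for a query that
    // reads a file, hits a database, or consumes a stream.
    public static IEnumerable<int> Numbers()
    {
        Runs++;
        for (int i = 1; i <= 3; i++)
            yield return i;
    }

    // The approach from the question: Count() first, then fill the array.
    public static int[] CountThenCopy(IEnumerable<int> source)
    {
        var array = new int[source.Count()]; // first full enumeration
        int index = 0;
        foreach (var item in source)         // second full enumeration
            array[index++] = item;
        return array;
    }
}
```

After `CountThenCopy(Numbers())` the counter reads 2, while `Numbers().ToArray()` starting from 0 leaves it at 1. With a cheap in-memory iterator the extra pass only costs time; with a source that cannot be replayed, or one that returns different data on the second pass, the array could end up the wrong size.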
The Buffer<T> class has an optimization for the case where the source sequence implements ICollection<T>:
internal Buffer(IEnumerable<TElement> source)
{
    int length = 0;
    TElement[] array = null;
    ICollection<TElement> collection = source as ICollection<TElement>;
    if (collection != null)
    {
        length = collection.Count;
        if (length > 0)
        {
            array = new TElement[length];
            collection.CopyTo(array, 0);
        }
    }
    else
    {
        ...
If the sequence doesn't implement ICollection<T>, the code cannot assume that it's safe to enumerate the sequence twice, so it falls back to resizing the array as required.
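To give a rough idea of what "resizing as required" can look like, here is a simplified sketch of a single-pass copy that grows its buffer by doubling. This is not the framework source (the elided branch above is left as is); `SinglePassBuffer`, `ToArraySinglePass` and the starting capacity of 4 are assumptions made for the illustration:

```csharp
using System;
using System.Collections.Generic;

public static class SinglePassBuffer
{
    // Hypothetical helper, not the framework's code: copies any sequence
    // into an array while enumerating the source exactly once.
    public static T[] ToArraySinglePass<T>(IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        T[] items = new T[4]; // assumed starting capacity
        int count = 0;

        foreach (T item in source)
        {
            // Buffer full: double its size and keep the existing items.
            if (count == items.Length)
                Array.Resize(ref items, items.Length * 2);
            items[count++] = item;
        }

        // Trim the buffer to the number of items actually read.
        Array.Resize(ref items, count);
        return items;
    }
}
```

Because the capacity doubles each time, the total copying work stays proportional to the number of items, so the whole thing is still O(n). The price is some temporary over-allocation, which is the trade the question was asking about, in exchange for never touching the source a second time.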