Consider the following enumerator:
var items = (new int[] { 1, 2, 3, 4, 5 }).Select(x =>
{
Console.WriteLine($"inspect {x}");
return x;
});
This yields the elements [1, 2, 3, 4, 5]
, printing them as they are consumed.
When I call the Last
method on this enumerator, it triggers a fast path which only accesses a single element:
items.Last();
inspect 5
But when I pass a callback to Last
, it loops through the whole list from the beginning:
items.Last(x => true);
inspect 1
inspect 2
inspect 3
inspect 4
inspect 5
Looking through the .NET Core source code, I find that:
Last(IEnumerable<T>)
forwards to TryGetLast(IEnumerable<T>, out bool)
;TryGetLast(IEnumerable<T>, out bool)
has a fast path for IPartition<T>
;ArraySelectIterator<T>
implements IPartition<T>
, this fast path is triggered and all is well.On the other hand:
Last(IEnumerable<T>, Func<T, bool>)
forwards to TryGetLast(IEnumerable<T>, Func<T, bool>, out bool)
OrderedEnumerator
and IList<T>
, but not ArraySelectIterator<T>
.This explains how the callback case is not optimized. But it doesn't explain why.
Conceptually, if at least one element satisfies the predicate (which is likely in practice), then iterating backward may allow for exiting the loop early.
It doesn't seem difficult to implement either: from what I've seen, all it takes is an additional method on IPartition<T>
.
The lack of optimization can also be surprising. Since these overloads share the same name, one might assume that they are also optimized in a similar way. (At least that's what I thought.)
Given these reasons to optimize this case, why did the authors of LINQ choose not to do that?
Last()
can be always optimized for collections that allow access to the last element of the collection in constant time (O(1)
). For those collections, instead of iterating all the collection and returning the last element, you can just access the last element directly.
Conceptually, if at least one element satisfies the predicate (which is likely in practice), then iterating backward may allow for exiting the loop early.
That assumption is way too far fetched for a generic implementation of Last(Func<T,bool>)
. You can't assume that the last element that satisfies the predicate is closer to the end of the collection in general. That optimization works well for your example (Last(x=>true)
), but for every such example there can be an oposite example where that optimization performs worse (Last(x=>false)
).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With