I always assumed that if I was using Select(x=> ...) in the context of LINQ to objects, then the new collection would be immediately created and remain static. I'm not quite sure WHY I assumed this, and it's a very bad assumption, but I did. I often use .ToList() elsewhere, but often not in this case.
This code demonstrates that even a simple 'Select' is subject to deferred execution:

    var random = new Random();
    var animals = new[] { "cat", "dog", "mouse" };
    var randomNumberOfAnimals = animals.Select(x => Math.Floor(random.NextDouble() * 100) + " " + x + "s");

    foreach (var i in randomNumberOfAnimals)
    {
        testContextInstance.WriteLine("There are " + i);
    }

    foreach (var i in randomNumberOfAnimals)
    {
        testContextInstance.WriteLine("And now, there are " + i);
    }
This outputs the following (the random function is called every time the collection is iterated through):
    There are 75 cats
    There are 28 dogs
    There are 62 mouses
    And now, there are 78 cats
    And now, there are 69 dogs
    And now, there are 43 mouses
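A deterministic variant of the same experiment may make the effect easier to see. This is only a sketch: DeferredDemo and CountSelectorCalls are names made up for illustration, and a call counter stands in for the random numbers so the result is repeatable. It enumerates a Select query twice, with and without ToList(), and reports how many times the selector ran.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of the original code.
public static class DeferredDemo
{
    // Enumerates a Select query twice and returns how often the selector ran.
    // 'materialize' chooses whether ToList() is applied before enumerating.
    public static int CountSelectorCalls(bool materialize)
    {
        var animals = new[] { "cat", "dog", "mouse" };
        int calls = 0;

        // Nothing runs yet: Select only builds a description of the work.
        IEnumerable<string> query = animals.Select(x => { calls++; return x + "s"; });

        if (materialize)
        {
            // The selector runs here, once per element, and the results are stored.
            query = query.ToList();
        }

        foreach (var unused in query) { } // first pass
        foreach (var unused in query) { } // second pass
        return calls;
    }

    public static void Main()
    {
        Console.WriteLine(CountSelectorCalls(materialize: false)); // 6: selector re-runs on each pass
        Console.WriteLine(CountSelectorCalls(materialize: true));  // 3: selector ran once, inside ToList()
    }
}
```

With the deferred query the selector runs once per element per pass, which is why the random numbers above change between the two loops.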
I have many places where I have an IEnumerable<T> as a member of a class. Often the results of a LINQ query are assigned to such an IEnumerable<T>. Normally for me, this does not cause issues, but I have recently found a few places in my code where it poses more than just a performance issue.
In trying to check for places where I had made this mistake, I thought I could check whether a particular IEnumerable<T> was of type IQueryable. This, I thought, would tell me if the collection was 'deferred' or not. It turns out that the enumerator created by the Select operator above is of type System.Linq.Enumerable+WhereSelectArrayIterator`2[System.String,System.String] and not IQueryable.
I used Reflector to see what this type inherits from, and it turns out not to inherit from anything that indicates it is 'LINQ' at all, so there is no way to test based upon the collection type.
I'm quite happy putting .ToArray() everywhere now, but I'd like to have a mechanism to make sure this problem doesn't happen in future. Visual Studio seems to know how to do it, because it gives a message about 'expanding the results view will evaluate the collection.'
The best I have come up with is:
bool deferred = !object.ReferenceEquals(randomNumberOfAnimals.First(), randomNumberOfAnimals.First());
Edit: This only works if a new object is created with 'Select', and it is not a generic solution. I'm not recommending it in any case, though! It was a somewhat tongue-in-cheek solution.
You can implement deferred execution for your own extension methods on IEnumerable<T> using the yield keyword of C#. For example, you can implement a custom extension method GetTeenAgerStudents for IEnumerable<T> that returns a sequence of all the students who are teenagers.
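A minimal sketch of such a method might look like the following. The Student class, its Name and Age properties, and the 13 to 19 age range are assumptions made for illustration; only the method name GetTeenAgerStudents comes from the text above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Student type; the text above does not define it.
public class Student
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class StudentExtensions
{
    // Iterator method: because of 'yield return', the body does not run
    // when the method is called, only when the result is enumerated.
    public static IEnumerable<Student> GetTeenAgerStudents(this IEnumerable<Student> source)
    {
        foreach (var student in source)
        {
            Console.WriteLine("Checking " + student.Name);

            // Assumption: "teenager" means an age from 13 to 19 inclusive.
            if (student.Age >= 13 && student.Age <= 19)
            {
                yield return student;
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var students = new List<Student>
        {
            new Student { Name = "Ann", Age = 12 },
            new Student { Name = "Bob", Age = 15 },
            new Student { Name = "Cid", Age = 21 },
        };

        // No "Checking ..." lines are printed by this call.
        var teens = students.GetTeenAgerStudents();
        Console.WriteLine("Query created");

        // The filter runs here, interleaved with the loop body.
        foreach (var teen in teens)
        {
            Console.WriteLine("Teen: " + teen.Name);
        }
    }
}
```

Enumerating teens a second time would run the whole filter again, which is the same behaviour as the Select example in the question.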
The main difference between IEnumerable and IQueryable in C# is that IQueryable is designed for querying out-of-memory data stores (such as a database), while IEnumerable queries in-memory data. Moreover, IQueryable is part of .NET's System.Linq namespace, while IEnumerable is in System.Collections (and IEnumerable<T> in System.Collections.Generic).
All LINQ methods are extension methods to the IEnumerable<T> interface. That means that you can call any LINQ method on any object that implements IEnumerable<T>. You can even create your own classes that implement IEnumerable<T>, and those classes will instantly "inherit" all LINQ functionality!
Deferred execution of LINQ has trapped a lot of people; you're not alone.
The approach I've taken to avoiding this problem is as follows:
Parameters to methods - use IEnumerable<T> unless there's a need for a more specific interface.
Local variables - usually at the point where I create the LINQ, so I'll know whether lazy evaluation is possible.
Class members - never use IEnumerable<T>, always use List<T>. And always make them private.
Properties - use IEnumerable<T>, and convert for storage in the setter.
    public IEnumerable<Person> People
    {
        get { return people; }
        // ToList() takes a snapshot, so a deferred query is evaluated once, here.
        set { people = value.ToList(); }
    }

    private List<Person> people;
While there are theoretical cases where this approach wouldn't work, I've not run into one yet, and I've been enthusiastically using the LINQ extension methods since late Beta.
BTW: I'm curious why you use ToArray() instead of ToList() - to me, lists have a much nicer API, and there's (almost) no performance cost.
Update: A couple of commenters have rightly pointed out that arrays have a theoretical performance advantage, so I've amended my statement above to "... there's (almost) no performance cost."
Update 2: I wrote some code to do some micro-benchmarking of the difference in performance between Arrays and Lists. On my laptop, and in my specific benchmark, the difference is around 5ns (that's nanoseconds) per access. I guess there are cases where saving 5ns per loop would be worthwhile ... but I've never come across one. I had to hike my test up to 100 million iterations before the runtime became long enough to accurately measure.
In general, I'd say you should try to avoid worrying about whether it's deferred.
There are advantages to the streaming execution nature of IEnumerable<T>. It is true that there are times when it's disadvantageous, but I'd recommend just always handling those (rare) times specifically: either call ToList() or ToArray() to convert it to a list or array as appropriate.
The rest of the time, it's better to just let it be deferred. Needing to frequently check this seems like a bigger design problem...