Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a performance impact when calling ToList()?

People also ask

Is ToList fast?

ToList() consistently runs faster, and would be a better choice if you're not planning to hang on to the results for a long time.

What does ToList () do?

The ToList<TSource>(IEnumerable<TSource>) method forces immediate query evaluation and returns a List<T> that contains the query results. You can append this method to your query in order to obtain a cached copy of the query results.

Is ToArray faster than ToList?

There is no significant difference between ToArray() and ToList() for most of the use-cases! But, it could be better to use ToArray() if the only use case is to iterate the result collection.

Why do we use ToList ()?

The tolist() function is used to convert a given array to an ordinary list with the same items, elements, or values.


IEnumerable.ToList()

Yes, IEnumerable<T>.ToList() does have a performance impact, it is an O(n) operation though it will likely only require attention in performance critical operations.

The ToList() operation will use the List(IEnumerable<T> collection) constructor. This constructor must make a copy of the array (more generally IEnumerable<T>), otherwise future modifications of the original array will change on the source T[] also which wouldn't be desirable generally.

I would like to reiterate this will only make a difference with a huge list, copying chunks of memory is quite a fast operation to perform.

Handy tip, As vs To

You'll notice in LINQ there are several methods that start with As (such as AsEnumerable()) and To (such as ToList()). The methods that start with To require a conversion like above (ie. may impact performance), and the methods that start with As do not and will just require some cast or simple operation.

Additional details on List<T>

Here is a little more detail on how List<T> works in case you're interested :)

A List<T> also uses a construct called a dynamic array which needs to be resized on demand, this resize event copies the contents of an old array to the new array. So it starts off small and increases in size if required.

This is the difference between the Capacity and Count attributes on List<T>. Capacity refers to the size of the array behind the scenes, Count is the number of items in the List<T> which is always <= Capacity. So when an item is added to the list, increasing it past Capacity, the size of the List<T> is doubled and the array is copied.


Is there a performance impact when calling toList()?

Yes of course. Theoretically even i++ has a performance impact, it slows the program for maybe a few ticks.

What does .ToList do?

When you invoke .ToList, the code calls Enumerable.ToList() which is an extension method that return new List<TSource>(source). In the corresponding constructor, under the worst circumstance, it goes through the item container and add them one by one into a new container. So its behavior affects little on performance. It's impossible to be a performance bottle neck of your application.

What's wrong with the code in the question

Directory.GetFiles goes through the folder and returns all files' names immediately into memory, it has a potential risk that the string[] costs a lot of memory, slowing down everything.

What should be done then

It depends. If you(as well as your business logic) gurantees that the file amount in the folder is always small, the code is acceptable. But it's still suggested to use a lazy version: Directory.EnumerateFiles in C#4. This is much more like a query, which will not be executed immediately, you can add more query on it like:

Directory.EnumerateFiles(myPath).Any(s => s.Contains("myfile"))

which will stop searching the path as soon as a file whose name contains "myfile" is found. This is obviously has a better performance then .GetFiles.


Is there a performance impact when calling toList()?

Yes there is. Using the extension method Enumerable.ToList() will construct a new List<T> object from the IEnumerable<T> source collection which of course has a performance impact.

However, understanding List<T> may help you determine if the performance impact is significant.

List<T> uses an array (T[]) to store the elements of the list. Arrays cannot be extended once they are allocated so List<T> will use an over-sized array to store the elements of the list. When the List<T> grows beyond the size the underlying array a new array has to be allocated and the contents of the old array has to be copied to the new larger array before the list can grow.

When a new List<T> is constructed from an IEnumerable<T> there are two cases:

  1. The source collection implements ICollection<T>: Then ICollection<T>.Count is used to get the exact size of the source collection and a matching backing array is allocated before all elements of the source collection is copied to the backing array using ICollection<T>.CopyTo(). This operation is quite efficient and probably will map to some CPU instruction for copying blocks of memory. However, in terms of performance memory is required for the new array and CPU cycles are required for copying all the elements.

  2. Otherwise the size of the source collection is unknown and the enumerator of IEnumerable<T> is used to add each source element one at a time to the new List<T>. Initially the backing array is empty and an array of size 4 is created. Then when this array is too small the size is doubled so the backing array grows like this 4, 8, 16, 32 etc. Every time the backing array grows it has to be reallocated and all elements stored so far have to be copied. This operation is much more costly compared to the first case where an array of the correct size can be created right away.

    Also, if your source collection contains say 33 elements the list will end up using an array of 64 elements wasting some memory.

In your case the source collection is an array which implements ICollection<T> so the performance impact is not something you should be concerned about unless your source array is very large. Calling ToList() will simply copy the source array and wrap it in a List<T> object. Even the performance of the second case is not something to worry about for small collections.


It will be as (in)efficient as doing:

var list = new List<T>(items);

If you disassemble the source code of the constructor that takes an IEnumerable<T>, you will see it will do a few things:

  • Call collection.Count, so if collection is an IEnumerable<T>, it will force the execution. If collection is an array, list, etc. it should be O(1).

  • If collection implements ICollection<T>, it will save the items in an internal array using the ICollection<T>.CopyTo method. It should be O(n), being n the length of the collection.

  • If collection does not implement ICollection<T>, it will iterate through the items of the collection, and will add them to an internal list.

So, yes, it will consume more memory, since it has to create a new list, and in the worst case, it will be O(n), since it will iterate through the collection to make a copy of each element.


"is there a performance impact that needs to be considered?"

The issue with your precise scenario is that first and foremost your real concern about performance would be from the hard-drive speed and efficiency of the drive's cache.

From that perspective, the impact is surely negligible to the point that NO it need not be considered.

BUT ONLY if you really need the features of the List<> structure to possibly either make you more productive, or your algorithm more friendly, or some other advantage. Otherwise, you're just purposely adding an insignificant performance hit, for no reason at all. In which case, naturally, you shouldn’t do it! :)