Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance difference with various List initialization and population techniques

Tags:

I was trying to find out if setting the capacity of a list in the constructor could flatten the already negligible performance differences between using List<T>(IEnumerable<T>) vs List<T>() + AddRange(IEnumerable<T>).

I found out setting the capacity before AddRange() actually leads to roughly the same performance of constructing the list with an initial collection.

But there's more. What if you use Add() instead of AddRange()? For some reason it seems like the performance improves by a 32%.

So what if I use Add() without initializing the constructor of the list with a initial capacity? It seems like the performance are worst than the previous Add() approach, but still better than AddRange() approach: there's a performance boost of a 26%.

This is strange enough for me to wonder if my test was a valid one; so I post the tests I did (run in release mode without debugger).

Can anybody confirm this?

int items = 100000;
int cycles = 10000;

var collectionToCopy = Enumerable.Range( 0, items );

var sw0 = new Stopwatch();
sw0.Start();
for( int i = 0; i < cycles; i++ )
{
    List<int> list = new List<int>( collectionToCopy );
}
sw0.Stop();
Console.WriteLine( sw0.ElapsedMilliseconds );

var sw1 = new Stopwatch();
sw1.Start();
for( int i = 0; i < cycles; i++ )
{
    List<int> list = new List<int>( items );
    list.AddRange( collectionToCopy );
}
sw1.Stop();
Console.WriteLine( sw1.ElapsedMilliseconds );

var sw4 = new Stopwatch();
sw4.Start();
for( int i = 0; i < cycles; i++ )
{
    List<int> list = new List<int>();
    list.AddRange( collectionToCopy );
}
sw4.Stop();
Console.WriteLine( sw4.ElapsedMilliseconds );

var sw2 = new Stopwatch();
sw2.Start();
for( int i = 0; i < cycles; i++ )
{
    List<int> list = new List<int>( items );
    foreach( var item in collectionToCopy )
        list.Add( item );
}
sw2.Stop();
Console.WriteLine( sw2.ElapsedMilliseconds );

var sw3 = new Stopwatch();
sw3.Start();
for( int i = 0; i < cycles; i++ )
{
    List<int> list = new List<int>();
    foreach( var item in collectionToCopy )
        list.Add( item );
}
sw3.Stop();
Console.WriteLine( sw3.ElapsedMilliseconds );

RESULTS:

13400 - constructor initialized with collection

13423 - constructor initialized with capacity + AddRange(collection)

14857 - parameterless constructor + AddRange(collection)

9003 - constructor initialized with capacity + Add(item)

9841 - parameterless constructor + Add(item)

like image 940
Mauro Sampietro Avatar asked Mar 23 '17 16:03

Mauro Sampietro


1 Answers

The devil is in Enumerable.Range(0, items). Changing it to Enumerable.Range(0, items).ToList(); will get you the expected result.

AddRange(IEnumerable collectionToCopy) will check the actual type of the input. If the input is continuous(e.g. list, array, ...), it will try to copy. Otherwise, it falls back to enumeration and simple Add().

And this is the fallback code

using(IEnumerator<T> en = collection.GetEnumerator()) {
   while(en.MoveNext()) {
        Insert(index++, en.Current);                                    

In the fallback case, the worse performance given by List.AddRange may have come from the IEnumertor overhead

like image 179
Xiaoguo Ge Avatar answered Sep 23 '22 09:09

Xiaoguo Ge