Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C#: ToArray performance [duplicate]

Background:

I admit I did not attempt to benchmark this, but I'm curious...

What are the CPU/memory characteristics of the Enumerable.ToArray<T> (and its cousin Enumerable.ToList<T>)?

Since IEnumerable does not advertise in advance how many elements it has, I (perhaps naively) presume ToArray would have to "guess" an initial array size, and then to resize/reallocate the array if the first guess appears to be too small, then to resize it yet again if the second guess appears to be too small etc... Which would give worse-than-linear performance.

I can imagine better approaches involving (hybrid) lists, but this would still require more than one allocation (though not reallocation) and quite bit of copying, though it could be linear overall despite the overhead.

Question:

Is there any "magic" taking place behind the scenes, that avoids the need for this repetitive resizing, and makes ToArray linear in space and time?

More generally, is there an "official" documentation on BCL performance characteristics?

like image 513
Branko Dimitrijevic Avatar asked Jul 19 '11 16:07

Branko Dimitrijevic


People also ask

What C is used for?

C programming language is a machine-independent programming language that is mainly used to create many types of applications and operating systems such as Windows, and other complicated programs such as the Oracle database, Git, Python interpreter, and games and is considered a programming foundation in the process of ...

What is C full form?

Originally Answered: What is the full form of C ? C - Compiler . C is a general-purpose, high-level language that was originally developed by Dennis M. Ritchie to develop the UNIX operating system at Bell Labs. C was originally first implemented on the DEC PDP-11 computer in 1972.

How old is the letter C?

The letter c was applied by French orthographists in the 12th century to represent the sound ts in English, and this sound developed into the simpler sibilant s.

What is C in C language?

What is C? C is a general-purpose programming language created by Dennis Ritchie at the Bell Laboratories in 1972. It is a very popular language, despite being old. C is strongly associated with UNIX, as it was developed to write the UNIX operating system.


2 Answers

No magic. Resizing happens if required.

Note that it is not always required. If the IEnumerable<T> being .ToArrayed also implements ICollection<T>, then the .Count property is used to pre-allocate the array (making the algorithm linear in space and time.) If not, however, the following (rough) code is executed:

    foreach (TElement current in source)
    {
        if (array == null)
        {
            array = new TElement[4];
        }
        else
        {
            if (array.Length == num)
            {
                // Doubling happens *here*
                TElement[] array2 = new TElement[checked(num * 2)];
                Array.Copy(array, 0, array2, 0, num);
                array = array2;
            }
        }
        array[num] = current;
        num++;
    }

Note the doubling when the array fills.

Regardless, it's generally a good practice to avoid calling .ToArray() and .ToList() unless you absolute require it. Interrogating the query directly when needed is often a better choice.

like image 86
dlev Avatar answered Sep 18 '22 15:09

dlev


I extracted the code behind .ToArray() method using .NET Reflector:

public static TSource[] ToArray<TSource>(this IEnumerable<TSource> source)
{
    if (source == null)
    {
        throw Error.ArgumentNull("source");
    }
    Buffer<TSource> buffer = new Buffer<TSource>(source);
    return buffer.ToArray();
}

and Buffer.ToArray:

internal TElement[] ToArray()
{
    if (this.count == 0)
    {
        return new TElement[0];
    }
    if (this.items.Length == this.count)
    {
        return this.items;
    }
    TElement[] destinationArray = new TElement[this.count];
    Array.Copy(this.items, 0, destinationArray, 0, this.count);
    return destinationArray;
}

And inside the Buffer constructor it loops through all elements to calculate the real Count and array of Elements.

like image 27
Mo Valipour Avatar answered Sep 20 '22 15:09

Mo Valipour