
C# performance curiosity

Tags: performance, c#

I'm curious about the program below (yes, it was run in release mode without a debugger attached). The first loop assigns a new object to each element of the array and takes about a second to run.

I wondered which part was taking the most time: object creation or assignment. So I wrote the second loop to time object creation and the third loop to time assignment, and both run in just a few dozen milliseconds. What's going on?

using System;
using System.Diagnostics;

static class Program
{
    const int Count = 10000000;

    static void Main()
    {
        var objects = new object[Count];
        var sw = new Stopwatch();
        sw.Restart();
        for (var i = 0; i < Count; i++)
        {
            objects[i] = new object();
        }
        sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds); // ~800 ms
        sw.Restart();
        object o = null;
        for (var i = 0; i < Count; i++)
        {
            o = new object();
        }
        sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds); // ~ 40 ms
        sw.Restart();
        for (var i = 0; i < Count; i++)
        {
            objects[i] = o;
        }
        sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds); // ~ 50 ms
    }
}
lobsterism asked Aug 10 '13 18:08




2 Answers

When an object that occupies less than 85,000 bytes of RAM and is not an array of double is created, it is placed in an area of memory called the Generation Zero heap. Every time the Gen0 heap grows to a certain size, every object in the Gen0 heap to which the system can find a live reference is copied to the Gen1 heap; the Gen0 heap is then bulk-erased so it has room for more new objects. If the Gen1 heap reaches a certain size, everything there to which a reference exists will be copied to the Gen2 heap, whereupon the Gen1 heap can be bulk-erased.

If many objects are created and immediately abandoned, the Gen0 heap will repeatedly fill up, but very few objects from the Gen0 heap will have to be copied to the Gen1 heap. Consequently, the Gen1 heap will be filled very slowly, if at all. By contrast, if most of the objects in the Gen0 heap are still referenced when the Gen0 heap gets full, the system will have to copy those objects to the Gen1 heap. This will force the system to spend time copying those objects, and may also cause the Gen1 heap to fill up enough that it will have to be scanned for live objects, and all the live objects from there will have to be copied again to the Gen2 heap. All this takes more time.
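The promotion described above can be observed with a small sketch. The class and method names here (PromotionDemo, GenerationsAcrossCollections) are made up for illustration; GC.GetGeneration, GC.Collect and GC.KeepAlive are the standard System.GC calls. The exact generation numbers reported depend on the runtime and GC configuration, so treat the output as something to observe rather than a guarantee.

```csharp
using System;

static class PromotionDemo
{
    // Hypothetical helper: records the generation of one surviving object
    // before any collection, after a Gen0 collection, and after a Gen1 collection.
    public static int[] GenerationsAcrossCollections()
    {
        var survivor = new object();
        var seen = new int[3];

        seen[0] = GC.GetGeneration(survivor); // freshly allocated
        GC.Collect(0);                        // force a Gen0 collection
        seen[1] = GC.GetGeneration(survivor);
        GC.Collect(1);                        // force a Gen0 + Gen1 collection
        seen[2] = GC.GetGeneration(survivor);

        // Keep the object reachable so the collector cannot reclaim it early.
        GC.KeepAlive(survivor);
        return seen;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", GenerationsAcrossCollections()));
    }
}
```

An object that is still referenced when a collection runs is the kind that has to be copied onward, which is the cost the first loop pays for every one of its objects.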

Another issue which slows things in your first test is that when trying to identify all live Gen0 objects, the system can ignore any Gen1 or Gen2 objects only if they haven't been touched since the last Gen0 collection. During the first loop, the objects array will be touched constantly; consequently, every Gen0 collection will have to spend time processing it. During the second loop, it's not touched at all, so even though there will be just as many Gen0 collections they won't take as long to perform. During the third loop, the array will be touched constantly, but no new heap objects are created, so no garbage-collection cycles will be necessary and it won't matter how long they would take.
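One way to check this reasoning, as a rough sketch rather than a rigorous benchmark, is to count how many collections of each generation happen during each loop. GC.CollectionCount is a standard API; the class and helper names (GcCounts, CollectionsDuring, StoreNew, CreateOnly, StoreExisting) are assumptions introduced for this example. Counts only show how often collections ran, not how long each one took.

```csharp
using System;

static class GcCounts
{
    const int Count = 10000000;

    // Runs `body` and returns how many Gen0, Gen1 and Gen2 collections were
    // counted while it ran.
    public static int[] CollectionsDuring(Action body)
    {
        int g0 = GC.CollectionCount(0);
        int g1 = GC.CollectionCount(1);
        int g2 = GC.CollectionCount(2);
        body();
        return new[]
        {
            GC.CollectionCount(0) - g0,
            GC.CollectionCount(1) - g1,
            GC.CollectionCount(2) - g2
        };
    }

    // First loop from the question: allocate and keep every object.
    public static void StoreNew(object[] objects)
    {
        for (int i = 0; i < objects.Length; i++) objects[i] = new object();
    }

    // Second loop: allocate and abandon. A newer JIT may optimize such
    // allocations differently than the 2013 runtime did.
    public static object CreateOnly(int n)
    {
        object o = null;
        for (int i = 0; i < n; i++) o = new object();
        return o;
    }

    // Third loop: store an existing reference, no allocation.
    public static void StoreExisting(object[] objects, object o)
    {
        for (int i = 0; i < objects.Length; i++) objects[i] = o;
    }

    static void Report(string label, int[] c)
    {
        Console.WriteLine(label + ": gen0=" + c[0] + " gen1=" + c[1] + " gen2=" + c[2]);
    }

    static void Main()
    {
        var objects = new object[Count];
        object last = null;
        Report("store new objects", CollectionsDuring(() => StoreNew(objects)));
        Report("create and abandon", CollectionsDuring(() => last = CreateOnly(Count)));
        Report("store existing", CollectionsDuring(() => StoreExisting(objects, last)));
    }
}
```

If the explanation holds, the first two loops should show Gen0 collections and the third should show few or none, with any difference in Gen1 and Gen2 counts pointing at the promotion cost.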

If you were to add a fourth loop which created and abandoned an object on each pass, but which also stored into an array slot a reference to a pre-existing object, I would expect that it would take longer than the combined times of the second and third loops even though it would be performing the same operations. Not as much time as the first loop, perhaps, since very few of the newly-created objects would need to get copied out of the Gen0 heap, but longer than the second because of the extra work required to determine which objects were still live. If you want to probe things even further, it might be interesting to do a fifth test with a nested loop:

for (int ii=0; ii<1024; ii++)
  for (int i=ii; i<Count; i+=1024)
     ..

I don't know the exact details, but .NET tries to avoid having to scan entire large arrays of which only a small part is touched by subdividing them into chunks. If a chunk of a large array is touched, all references within that chunk must be scanned, but references stored in chunks which haven't been touched since the last Gen0 collection may be ignored. Breaking up the loop as shown above might cause .NET to end up touching most of the chunks in the array between Gen0 collections, quite possibly yielding a slower time than the first loop.
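The fourth and fifth tests suggested above might be sketched as follows. This is only an illustration: the names (ExtraLoops, TimeMs, AllocateAndStoreExisting, StridedStoreNew) are invented here, and the body of the strided loop is an assumption, since the original snippet elides it; storing a new object into each slot, as the first loop does, seems the most likely intent. No timings are claimed.

```csharp
using System;
using System.Diagnostics;

static class ExtraLoops
{
    const int Count = 10000000;

    // Times a single run of `body` in milliseconds.
    public static long TimeMs(Action body)
    {
        var sw = Stopwatch.StartNew();
        body();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    // Fourth test: create and abandon an object on each pass, and also
    // store a reference to a pre-existing object into the array slot.
    public static void AllocateAndStoreExisting(object[] objects, object existing)
    {
        object temp = null;
        for (int i = 0; i < objects.Length; i++)
        {
            temp = new object();   // abandoned on the next pass
            objects[i] = existing; // touches the array without keeping temp
        }
        GC.KeepAlive(temp);
    }

    // Fifth test: visit the slots in a strided order so that stores are
    // spread across the whole array. The loop body is an assumption.
    public static void StridedStoreNew(object[] objects, int stride)
    {
        for (int ii = 0; ii < stride; ii++)
            for (int i = ii; i < objects.Length; i += stride)
                objects[i] = new object();
    }

    static void Main()
    {
        var objects = new object[Count];
        var existing = new object();
        Console.WriteLine("fourth: " + TimeMs(() => AllocateAndStoreExisting(objects, existing)) + " ms");
        Console.WriteLine("fifth:  " + TimeMs(() => StridedStoreNew(objects, 1024)) + " ms");
    }
}
```

Comparing these numbers against the three timings in the question would show whether touching the array between Gen0 collections really accounts for the difference.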

supercat answered Oct 12 '22 10:10


  1. You create 10 million objects and store them in separate locations in memory. Memory consumption is highest here.
  2. You create 10 million objects, but they are not stored anywhere, just discarded.
  3. You store 10 million references to one existing object (the last one created in the second loop); no new objects are created, so memory consumption is minimal.
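The memory difference between cases 1 and 3 can be estimated with a rough sketch like the one below. GC.GetTotalMemory is a standard API, but it only returns an approximation of the managed heap size; the class and method names (MemoryProbe, RetainedBytes) and the smaller count are assumptions made for this example.

```csharp
using System;

static class MemoryProbe
{
    // Returns the approximate growth of the managed heap after filling an
    // array of `count` slots, either with distinct objects (case 1) or with
    // references to a single shared object (case 3).
    public static long RetainedBytes(int count, bool distinctObjects)
    {
        long before = GC.GetTotalMemory(true); // collect first for a cleaner baseline

        var objects = new object[count];
        var shared = new object();
        for (int i = 0; i < count; i++)
        {
            objects[i] = distinctObjects ? new object() : shared;
        }

        long after = GC.GetTotalMemory(true);
        GC.KeepAlive(objects); // the array must stay reachable until after the measurement
        return after - before;
    }

    static void Main()
    {
        const int Count = 1000000;
        Console.WriteLine("distinct objects: " + RetainedBytes(Count, true) + " bytes");
        Console.WriteLine("shared object:    " + RetainedBytes(Count, false) + " bytes");
    }
}
```

In both runs the array itself takes the same space; the extra bytes in the first run are the objects that stay alive because the array references them.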

And yes, the performance analysis below is for only 10 thousand objects (10 million would take too long).

[Image: Performance for ONLY 10 thousand objects]

UPDATE: This diagram shows the CPU work for memory allocation in the first case. Notice the JIT_New@@... function taking 80.5% of CPU time.

[Image: CPU performance, case 1]

UPDATE2: And for completeness, the CPU time for the second case.

[Image: CPU performance, case 2]

UPDATE3: Just for completeness, the third case.

[Image: CPU performance, case 3]

Nenad answered Oct 12 '22 08:10