Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Slow writing to array in C++

I was just wondering if this is expected behavior in C++. The code below runs at around 0.001 ms:

for(int l=0;l<100000;l++){
        int total=0;
        for( int i = 0; i < num_elements; i++) 
        {
            total+=i;
        }
    }

However if the results are written to an array, the time of execution shoots up to 15 ms:

int *values=(int*)malloc(sizeof(int)*100000);
        for(int l=0;l<100000;l++){
            int total=0;
            for( unsigned int i = 0; i < num_elements; i++) 
            {
                total+=i;
            }
            values[l]=total;
        }

I can appreciate that writing to the array takes time but is the time proportionate?

Cheers everyone

like image 266
Ljdawson Avatar asked Feb 10 '10 02:02

Ljdawson


3 Answers

The first example can be implemented using just CPU registers. Those can be accessed billions of times per second. The second example uses so much memory that it certainly overflows L1 and possibly L2 cache (depending on CPU model). That will be slower. Still, 15 ms/100.000 writes comes out to 1.5 ns per write - 667 Mhz effectively. That's not slow.

like image 107
MSalters Avatar answered Oct 17 '22 14:10

MSalters


It looks like the compiler is optimizing that loop out entirely in the first case.

The total effect of the loop is a no-op, so the compiler just removes it.

like image 10
Anon. Avatar answered Oct 17 '22 14:10

Anon.


It's very simple. In first case You have just 3 variables, which can be easily stored in GPR (general purpose registers), but it doesn't mean that they are there all the time, but they are probably in L1 cache memory, which means thah they can be accessed very fast.

In second case You have more than 100k variables, and You need about 400kB to store them. That is deffinitely to much for registers and L1 cache memory. In best case it could be in L2 cache memory, but probably not all of them will be in L2. If something is not in register, L1, L2 (I assume that your processor doesn't have L3) it means that You need to search for it in RAM and it takes muuuuuch more time.

like image 3
Tomek Tarczynski Avatar answered Oct 17 '22 14:10

Tomek Tarczynski