Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

A big loop within a small loop always faster than a small loop within a big one?

I just read this post, and wonder if we can draw the conclusion that a big loop within a small loop must always run faster than a small loop within a big one, no matter what the code does inside the nested loop? Take an example.

int m, n; 
m = 1000000;
n = 10;

Snippet A

for (int i = 0; i < n; i++)         
    for (int j=0; j < m; j++)               
       {       
           DoSomething();        
       }

   

Snippet B

for (int j = 0; j < m; j++)               
    for (int i=0; i < n; i++)           
       {       
          DoSomething();          
       }

   

Can we say that, no matter what DoSomething() actually does, snippet A always runs faster thant snippet B?


As pointed out by @stackmate, I want to expand this question into two

  1. When the code inside nested loop is DoSomething() which means DoSomething() has nothing to do with variable i and j. What is the performance difference?

  2. When the code inside nested loop is DoSomething(i, j) which means DoSomething(i, j) has relateship with variable i and j. What is the performance difference?

like image 225
Wallace Avatar asked May 28 '14 14:05

Wallace


2 Answers

There cannot be a specific answer to your question. The parameter deciding whether it will be fast or not is what you are doing inside the loops. For example say you are adding 2 arrays and storing them in a third array:

Code 1:
for(int i = 0; i < 1000; i++)
{
    for(int j = 0; j < 1000000; j++)
         C[i][j] = A[i][j] + B[i][j];
}

Code 2:
for(int i = 0; i < 1000000; i++)
{
    for(int j = 0; j < 1000; j++)
         C[j][i] = A[j][i] + B[j][i];
}

Code 1 will be much faster than code 2. The reason is cache. Take a look at this question for more details. The answers are superbly informative and there is no point in me explaining the concept of cache again over here.

like image 157
Cool_Coder Avatar answered Oct 15 '22 00:10

Cool_Coder


@Cool_Coder already covered one main reason (memory access patterns resulting in better cache hit rates) why having the smaller loop as the inside loop can be beneficial.

Another scenario is that the inner loop could be unrolled. Particularly if the size of the smaller loop is really small and fixed, the compiler will unroll the inner loop if it's beneficial. The resulting code will then have only one loop instead of two nested loops, with a reduction in branches.

If you have a situation like this in highly performance critical code, you need to try both, and benchmark carefully. If the code is not very performance critical, it's probably not worth worrying about.

like image 25
Reto Koradi Avatar answered Oct 15 '22 00:10

Reto Koradi