Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Multiple nested for loops vs single for loop [closed]

I was doing a little speed testing in c++ (MSVS) and got a strange result. I was testing the speed of using one for loop vs multiple nested for loops. Here is the code:

double testX = 0;
// Single loop executes in roughly 0.04 seconds
for( int i = 0; i < 27000000; i++ ){
    testX += 1;
}

// Nested loop executes in roughly 0.03 seconds
for( int x = 0; x < 300; x++ ){
    for( int y = 0; y < 300; y++ ){
        for( int z = 0; z < 300; z++ ){
            testX += 1;
        }
    }
}

As you can see, the speed difference is fairly obvious. I have run this many times, and those are the average times I am seeing (these are timed using glfwGetTime()).

So my question is: why? Is my test method inadequate? Am I using too few loops? I have tried searching google, and the only similar question I could find related his problem to cache coherency, but since these are empty for loops, I didn't think it would really have an effect.

Any help is appriciated :)

Edit: Thanks to the comments, I realized that using empty for loops probably wasn't the best way of testing things... So I have updated my code to do some (very) simple operations to a double. I also am compiling in release mode. However, though the two methods are a lot more similar in times, the second method is still slightly faster.

And yes, this is all the test code (minus the timing/output functions, but those aren't exactly specific to the question).

like image 772
Jamie Syme Avatar asked Sep 14 '12 04:09

Jamie Syme


People also ask

Is single loop faster than nested loop?

Nested loops are not faster than single loops.

How many nested for loops is too many?

Microsoft BASIC had a nesting limit of 8. @Davislor: The error message refers to the stack size in the compiler, which is also a program, and which is recursively processing the nested-looped construct.

Is it bad to have nested for loops?

Nested loops are frequently (but not always) bad practice, because they're frequently (but not always) overkill for what you're trying to do. In many cases, there's a much faster and less wasteful way to accomplish the goal you're trying to achieve.

Which of the nested for loop will run faster?

Most likely neither are measurably faster and they may get optimized to the same exact code depending on the compiler.


2 Answers

The compiler won't "optimize" the loops away when the testX variable is used somewhere later in the code. When I just add one line to the code to output testX, the results are as follows:

  • single for loop: 1.218 ms
  • nested for loop: 1.218 ms

This pretty much shows that the compiler converts the nested loop into a single loop, whenever possible. The loop index can be used to prevent that optimisation:

Modifying the code this way

for( int i = 0; i < 27000000; i++ ){
    testX += i;
}

and

for( int x = 0; x < 300; x++ ){
    testX += x;
    for( int y = 0; y < 300; y++ ){
        testX += y;
        for( int z = 0; z < 300; z++ ){
            testX += z;
        }
    }
}

will add a little overhead to the nested loop but the execution time shows

  • single for loop: 1.224 ms
  • nested for loop: 1.226 ms

The times given here are averaged over 30.000 loop runs.

Note: The testX += x; only contributes 1 in 90000 and the testX += x; only contributes 1 in 300. Thus the two sections above remain comparable.

Nested loops are not much slower than single loops but your observation that they are faster is not true.

And: The times you show are about 40 times the times I observed. I'd suggest to carefully inspect the compiler settings since I ran the test on a medium speed hardware. Maybe the results of glfwGetTime() are questionable and this is the main reason for your question. Have you tried to use another timing scheme?

Edit: In order to prevent compiler optimisation the loop limit can be choosen to be non constant:

int lmt = rand() % 1 + 300;      // random value 300 or 301 
int big_lmt = lmt * lmt * lmt;   // random value 27000000 or 27270901

for( int i = 0; i < big_lmt; i++ ){
    testX += i;
}

for( int x = 0; x < lmt; x++ ){
    testX += x;
    for( int y = 0; y < lmt; y++ ){
        testX += y;
        for( int z = 0; z < lmt; z++ ){
            testX += z;
        }
    }
}

This avoids compiler predictability.

Results (for a lmt = 300 case to be comparable):

  • single for loop: 1.213 ms
  • nested for loop: 1.216 ms

Result:

  • Nested loops are not faster than single loops.
like image 74
Arno Avatar answered Oct 03 '22 09:10

Arno


If you don't use your for variables (x,y,z) inside your for loop, a smart compiler could (and should) convert your second form in a single for loop without nesting. Unless you prevent such compiler optimization removing static predictability by having the user input the x,y,z values at runtime from stdin, or reading from some stream, etc..

Furthermore, if you don't do something with your testX variable (say, printing it to stdout), a smart compiler could (and should) optimize it away, i.e. remove the dead code altogether.

So what I'm saying is that the benchmark, as it is right now, is somehow ill-formed.

like image 36
Unai Vivi Avatar answered Oct 03 '22 11:10

Unai Vivi