Performance of breaking apart one loop into two loops

Question

Good Day,

Suppose that you have a simple for loop like below...

for(int i=0;i<10;i++)
{
    //statement 1
    //statement 2
}

Assume that statement 1 and statement 2 were O(1). Besides the small overhead of "starting" another loop, would breaking down that for loop into two (not nested, but sequential) loops be as equally fast? For example...

for(int i=0;i<10;i++)
{
    //statement 1
}
for(int i=0;i<10;i++)
{
    //statement 2
}

Why I ask such a silly question is that I have a Collision Detection System(CDS) that has to loop through all the objects. I want to "compartmentalize" the functionality of my CDS system so I can simply call

cds.update(objectlist);

instead of having to break my cds system up. (Don't worry too much about my CDS implementation... I think I know what I am doing, I just don't know how to explain it, what I really need to know is if I take a huge performance hit for looping through all my objects again.

JoeG · Accepted Answer

In terms of algorithmic complexity splitting the loops makes no difference.

In terms of real world performance splitting the loops could improve performance, worsen performance or make no difference - it depends on the OS, hardware and - of course - what statement 1 and statement 2 are.

Matthieu M. · Answer

It depends on your application.

Possible Drawbacks (of splitting):

your data does not fit into the L1 data cache, therefore you load it once for the first loop and then reload it for the second loop

Possible Gains (of splitting):

your loop contains many variables, splitting helps reducing register/stack pressure and the optimizer turns it into better machine code
the functions you use trash the L1 instruction cache so the cache is loaded on each iteration, while by splitting you manage into loading it once (only) at the first iteration of each loop

These lists are certainly not comprehensive, but already you can sense that there is a tension between code and data. So it is difficult for us to take an educated/a wild guess when we know neither.

In doubt: profile. Use callgrind, check the cache misses in each case, check the number of instructions executed. Measure the time spent.

Unknown1987 · Answer

With two loops you will be paying for:

increased generated code size
2x as many branch predicts
depending what the data layout of statement 1 and 2 are you could be reloading data into cache.

The last point could have a huge impact in either direction. You should measure as with any perf optimization.

Performance of breaking apart one loop into two loops

Tags:

c++

loops

big-o

Matthew

Video Answer

3 Answers

JoeG

Matthieu M.

Unknown1987

Recent Activity

Donate For Us

Performance of breaking apart one loop into two loops

Tags:

c++

loops

big-o

Matthew

Video Answer

3 Answers

JoeG

Matthieu M.

Unknown1987

Related questions

Recent Activity

Donate For Us