Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimizing a Loop vs Code Duplication

My dilemma concerns how to best handle long heavy loops which can accept parameters. Consider the following method:

void HeavyLoop(byte* startingAddress, bool secondaryModification)
{
    for (int i = 0; i < 10000000; i++)
    {
        byte* b = startingAddress + i;
        *b+= 1;
        if (secondaryModification) *b+= 2;
    }
}

This method will do what I want, but I am using 10000000 unnecessary ifs inside the loop.

Had I written the same method like this:

void HeavyLoop(byte* startingAddress, bool secondaryModification)
{
    if (secondaryModification)
    {
        for (int i = 0; i < 10000000; i++)
        {
            byte* b = startingAddress + i;
            *b+= 1;
            *b+= 2;
        }
    }
    else
    {
        for (int i = 0; i < 10000000; i++)
        {
            byte* b = startingAddress + i;
            *b+= 1;         
        }
    }   
}

I would get the same result, though my entire loop code would have to be duplicated. This is not a big deal if we're talking about one parameter, but when you have 4 independent parameters I would have to write 16 different versions of the loop.

What is the "correct" solution in cases like this? If this were a language like Python I could just dynamically build a function to handle the loop. Is there something similar in C++?

Needless to say, the code is only an example and not the actual case. Please don't give solutions pertaining to *b+=1 per se. I am a C++ novice so forgive me if there is a simple solution I am not aware of. Also forgive me if there are syntax errors, I don't have a compiler handy at the moment.

Edit: The issue is dealing with statements that can not be pre-calculated outside of the loop.

like image 363
Rotem Avatar asked Jan 10 '12 15:01

Rotem


1 Answers

You could implement the loop as a template; the template argument is a compile-time constant, so optimisation ought to remove the unwanted code when it's false. You then need a wrapper to allow the correct specialisation to be called based on a run-time value:

template <bool secondaryModification>
void HeavyLoop(byte* startingAddress)
{
    for (int i = 0; i < 10000000; i++)
    {
        byte* b = startingAddress + i;
        *b+= 1;
        if (secondaryModification) *b+= 2;
    }
}

void HeavyLoop(byte* startingAddress, bool secondaryModification)
{
    if (secondaryModification) {
        HeavyLoop<true>(startingAddress);
    } else {
        HeavyLoop<false>(startingAddress);
    }
}

During compilation, both versions of the template will be instantiated (one containing *b+=2; and one not, and neither performing a runtime test on the argument); they should then be inlined in the wrapper function to generate exactly the same code as your second example - but without the need to duplicate any source code.

like image 68
Mike Seymour Avatar answered Oct 11 '22 06:10

Mike Seymour