C++ massive performance loss because of if statement

I am running a while loop in 4 threads. In the loop I evaluate a function and increment a counter.

while(1) {
    int fitness = EnergyFunction::evaluate(sequence);

    mutex.lock();
    counter++;
    mutex.unlock();
}

When I run this loop in 4 threads, as described above, I get ~ 20 000 000 evaluations per second.

while(1) {
    if (dist(mt) == 0) {
        sequence[distDim(mt)] = -1;
    } else {
        sequence[distDim(mt)] = 1;
    }
    int fitness = EnergyFunction::evaluate(sequence);

    mainMTX.lock();
    overallGeneration++;
    mainMTX.unlock();
}

If I add a random mutation to the sequence, I get ~ 13 000 000 evaluations per second.

while(1) {
    if (dist(mt) == 0) {
        sequence[distDim(mt)] = -1;
    } else {
        sequence[distDim(mt)] = 1;
    }
    int fitness = EnergyFunction::evaluate(sequence);

    mainMTX.lock();
    if(fitness < overallFitness)
        overallFitness = fitness;

    overallGeneration++;
    mainMTX.unlock();
}

But when I add a simple if statement that checks whether the new fitness is smaller than the old fitness and, if so, replaces the old fitness with the new one, the performance loss is massive! Now I get ~ 20 000 evaluations per second. If I remove the random mutation part, I also get ~ 20 000 evaluations per second.

Variable overallFitness is declared as

extern int overallFitness;

I am having trouble figuring out what causes such a big performance loss. Is comparing two ints really such a time-consuming operation?

Also, I don't believe it is related to mutex locking.

UPDATE

This performance loss was not caused by branch prediction. In the fast versions the compiler simply ignored the call int fitness = EnergyFunction::evaluate(sequence);.

I have now added volatile and the compiler doesn't ignore the call anymore.

Also thank you for pointing out branch misprediction and atomic<int>, didn't know about them!

Because of atomic I also removed the mutex part, so the final code looks like this:

while(1) {
    sequence[distDim(mt)] = lookup_Table[dist(mt)];
    fitness = EnergyFunction::evaluate(sequence);
    if(fitness < overallFitness)
        overallFitness = fitness;
    ++overallGeneration;
}

Now I am getting ~ 25 000 evaluations per second.
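One caveat about the final loop above: if overallFitness is an atomic<int>, the comparison and the assignment are still two separate atomic operations, so another thread can store a smaller value in between and have it overwritten. A compare-and-swap loop is a common way to close that gap. The sketch below is only an illustration under assumptions: update_min and parallel_min are hypothetical helper names, not part of the original code, and relaxed memory ordering is assumed to be enough because only the minimum itself is shared.

```cpp
#include <atomic>
#include <cstddef>
#include <limits>
#include <thread>
#include <vector>

// Hypothetical helper: lower `best` to `candidate` if candidate is smaller.
// compare_exchange_weak reloads `current` on failure, so the loop re-checks
// against whatever value another thread may have stored in the meantime.
void update_min(std::atomic<int>& best, int candidate) {
    int current = best.load(std::memory_order_relaxed);
    while (candidate < current &&
           !best.compare_exchange_weak(current, candidate,
                                       std::memory_order_relaxed)) {
        // retry with the refreshed value of `current`
    }
}

// Hypothetical driver: each thread scans a strided slice of `values`
// and publishes what it sees through update_min.
int parallel_min(const std::vector<int>& values, int num_threads) {
    std::atomic<int> best{std::numeric_limits<int>::max()};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&values, &best, t, num_threads] {
            for (std::size_t i = static_cast<std::size_t>(t); i < values.size();
                 i += static_cast<std::size_t>(num_threads)) {
                update_min(best, values[i]);
            }
        });
    }
    for (auto& w : workers) w.join();
    return best.load();
}
```

In the loop from the question, the if statement would become a single call such as update_min(overallFitness, fitness).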

321

asked Oct 29 '15 13:10

TomazStoiljkovic

2 Answers

You need to run a profiler to get to the bottom of this. On Linux, use perf.

My guess is that EnergyFunction::evaluate() is being entirely optimized away because, in the first examples, you don't use the result, so the compiler can discard the whole call. You can try writing the return value to a volatile variable, which should force the compiler or linker not to optimize the call away. A 1000x slowdown is definitely not attributable to a simple comparison.
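As a rough sketch of the volatile idea, assuming a made-up evaluate function in place of the real EnergyFunction::evaluate (the names evaluate, sink and benchmark below are hypothetical): every store to a volatile object is an observable side effect, so the compiler has to produce the value being stored on each iteration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for EnergyFunction::evaluate (not the real function):
// sums the products of neighbouring elements.
int evaluate(const std::vector<int>& sequence) {
    int energy = 0;
    for (std::size_t i = 0; i + 1 < sequence.size(); ++i) {
        energy += sequence[i] * sequence[i + 1];
    }
    return energy;
}

// Stores to a volatile object cannot be elided, so the result is "used".
volatile int sink = 0;

// Runs `iterations` evaluations and returns how many were performed.
long benchmark(std::vector<int>& sequence, long iterations) {
    if (sequence.empty()) return 0;
    long evaluations = 0;
    for (long i = 0; i < iterations; ++i) {
        // Flip one element so the input changes on every iteration and the
        // result cannot simply be computed once outside the loop.
        std::size_t pos = static_cast<std::size_t>(i) % sequence.size();
        sequence[pos] = -sequence[pos];
        sink = evaluate(sequence);  // volatile store keeps the call alive
        ++evaluations;
    }
    return evaluations;
}
```

Comparing the timing of this loop with a version that drops the store to sink should show whether the call was being removed before.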

178

answered Oct 19 '22 02:10

Mark Lakata

There is actually an atomic instruction to increase an int by 1, so a smart compiler may be able to remove the mutex entirely, although I'd be surprised if it did. You can test this by looking at the assembly, or by removing the mutex, changing the type of overallGeneration to atomic<int>, and checking how fast it still is. This optimization is no longer possible with your last, slow example.
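A minimal sketch of that experiment, assuming the counter is the only shared state (count_with_atomic is a hypothetical name, and relaxed ordering is an assumption that holds only because nothing else is synchronized through the counter):

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical test harness: num_threads workers each increment a shared
// std::atomic<int> iterations_per_thread times, with no mutex involved.
int count_with_atomic(int num_threads, int iterations_per_thread) {
    std::atomic<int> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, iterations_per_thread] {
            for (int i = 0; i < iterations_per_thread; ++i) {
                // A single atomic read-modify-write; no increments are lost.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter.load();
}
```

Calling count_with_atomic(4, n) mirrors the 4-thread setup from the question. Timing it against the mutex version would show how much of the cost comes from locking.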

Also, if the compiler can see that evaluate does nothing to the global state and the result isn't used, then it can skip the entire call to evaluate. You can find out whether that's the case by looking at the assembly, or by removing the call to EnergyFunction::evaluate(sequence) and looking at the timing: if it doesn't speed up, the function wasn't called in the first place. This optimization is no longer possible with your last, slow example. You should be able to stop the compiler from skipping EnergyFunction::evaluate(sequence) by defining the function in a different object file (another cpp file or a library) and disabling link-time optimization.

There are other effects here that also create a performance difference, but I can't see any that would explain a difference of a factor of 1000. A factor of 1000 usually means the compiler cheated in the previous test and the change now prevents it from cheating.

answered Oct 19 '22 02:10

Peter

Tags:

c++

optimization

multithreading