Helping the compiler to optimize branchy code sequences

Tags:

I have code sequences in C/C++ that contain lots of branches, something like this:

if( condition1 )
    return true;
if( condition2 )
    return true;
...
return false;

(which is equivalent to return condition1 || condition2 || ...;)

Evaluating each of the conditions requires several memory accesses (all read-only) but the compiler misses an important optimization opportunity by not moving memory accesses before a previous condition is evaluated. The reason being that condition2's memory accesses may segfault when condition1 is true. Well I know that they don't and I would like the compiler to do the sensible thing and mix some of these code sequences where it is appropriate for performance, e.g. to exploit instruction-level-parallelism. I also don't want to change the condition to a logical or (not short-circuit) because one of the branches will likely jump out.

Any ideas on how this can be accomplished (preferably using gcc)?

Thanks.

465

asked Jun 09 '11 08:06

hans

2 Answers

Evaluating each of the conditions requires several memory accesses

Why don't you avoid short-circuit evaluation within individual conditions, but let it happen for the or-ing of conditions?

Using operators that are not short-circuit for built-in types

Exactly how you achieve the former depends on the nature of those conditions (i.e. condition1, condition2 in your code) - given you show nothing about them I can only talk in generalities: where they internally contain short-circuit operators, instead convert the boolean value to an integer representation and use e.g. bitwise-OR (or even '+' or '*' if it reads better and works in your particular usage). Bitwise operators are generally safer as they're lower precedence though - only have to be careful if your conditions already include bitwise operators.

To illustrate:

OLD: return (a > 4 && b == 2 && c < a) ||   // condition1
            (a == 3 && b != 2 && c == -a);  // condition2

NEW: return (a > 4 & b == 2 & c < a) ||
            (a == 3 & b != 2 & c == -a);

Also be careful if you used implicit conversion of numbers/pointers to bool before... you want to normalise them to bools so their least-significant bits reflect their boolean significance:

OLD: return my_int && my_point && !my_double;
NEW: return bool(my_int) & bool(my_point) & !my_double;  // ! normalises before bitwise-&

You might also want to benchmark with...

     bool condition1 = a > 4 & b == 2 & c < a;
     bool condition2 = a == 3 & b != 2 & c == -a;
     return condition1 || condition2;

...which might be faster - possibly only in the overall "return false" case and perhaps when the last conditionN or two are the deciding factor in a "return true".

User-defined operators avoid short-circuit evaluation

Seperately, short-circuit evaluation is disabled for objects with overloaded logical operators, which provides another avenue for you to do your checks using the existing notation, but you'll have to change or enhance your data types.

Thoughts

More generally, you'll only benefit from this if you've a large number of assertions combined in each individual condition - more so if the function tends to run through to return false.

"AProgrammer" makes a great point too - with speculative execution available on modern CPUs, the CPU may already be ahead of the ordering implied by short-circuit evaluation (in some special mode than either avoids or suppresses any memory faults from dereferencing invalid pointers, divides by 0 etc). So, it's possible that the entire attempt at optimisation may prove pointless or even counterproductive. Benchmarking of all alternatives is required.

154

answered Oct 29 '22 22:10

Tony Delroy

Could you just move the parts of the condition yourself?

 const bool bCondition1Result = <condition1>;
 const bool bCondition2Result = <condition2>;

and so on

For better optimisation still ... re-jig the order of your conditions so that the most hit one is the first to be checked. This way it will early out more often than not (This may make precious little difference).

answered Oct 29 '22 23:10

Goz

Related questions
                            
                                gcc and clang disagree on whether expression is constant evaluated
                            
                                Fully thread-safe shared_ptr implementation
                            
                                Boost - cross compile - "from Linux" "to Windows"
                            
                                Using mtrace for c++
                            
                                C++ return type overload hack
                            
                                Long constructor initialization lists
                            
                                Is FC++ used by any open source projects?
                            
                                Dll Memory Management
                            
                                Why do I get cc1plus: error: unrecognized command line option "-arch"?
                            
                                How do you compile/build/execute a C++ project in Geany?
                            
                                how boost::~shared_ptr works?
                            
                                No debugging symbols in gdb even when compiling with -g flag
                            
                                Can we have an anonymous struct as template argument?
                            
                                When to use pointer to a class and when to just instantiate it as a variable
                            
                                Lzz (Lazy C++) - #include file not found
                            
                                Sending Protobuf Messages with boost::asio
                            
                                Create a text file and write to it in C++?
                            
                                gdb catchpoint for uncaught C++ exceptions only?
                            
                                What is a Sandbox
                            
                                Downcasting shared pointer to derived class with additional functionality - is this safe?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With