I have the following cycle:
//condition will be set here to true or false
for (int i = 0; i < LARGE_NUMBER; i++) {
if (condition) {
//do foo
} else {
//do bar
}
}
Assumption: A cycle if faster without a condition than with a condition. (Is this true?)
Question: Will gcc factor out my if
, if condition
has been set outside the for cycle, and the cycle itself doesn't touch condition
?
If not, I should switch the if
and the for
, duplicate code, violate DRY, etc.
For those who don't wish to read a lengthy post, this optimization is called (in LLVM) Loop Unswitch.
Why not ask a compiler ?
void foo(char* c);
int main(int argc, char **argv) {
bool const condition = argc % 2;
for (int i = 0; i != argc; ++i) {
if (condition) {
foo(argv[1]);
} else {
foo(argv[0]);
}
}
return 0;
}
Is transformed into SSA form (via LLVM try out):
define i32 @main(i32 %argc, i8** nocapture %argv) {
entry:
%0 = icmp eq i32 %argc, 0 ; <i1> [#uses=1]
br i1 %0, label %bb5, label %bb.nph
bb.nph: ; preds = %entry
%1 = and i32 %argc, 1 ; <i32> [#uses=1]
%toBool = icmp eq i32 %1, 0 ; <i1> [#uses=1]
%2 = getelementptr inbounds i8** %argv, i64 1 ; <i8**> [#uses=1]
br i1 %toBool, label %bb3.us, label %bb3
bb3.us: ; preds = %bb3.us, %bb.nph
%i.07.us = phi i32 [ %4, %bb3.us ], [ 0, %bb.nph ] ; <i32> [#uses=1]
%3 = load i8** %argv, align 8 ; <i8*> [#uses=1]
tail call void @_Z3fooPc(i8* %3)
%4 = add nsw i32 %i.07.us, 1 ; <i32> [#uses=2]
%exitcond = icmp eq i32 %4, %argc ; <i1> [#uses=1]
br i1 %exitcond, label %bb5, label %bb3.us
bb3: ; preds = %bb3, %bb.nph
%i.07 = phi i32 [ %6, %bb3 ], [ 0, %bb.nph ] ; <i32> [#uses=1]
%5 = load i8** %2, align 8 ; <i8*> [#uses=1]
tail call void @_Z3fooPc(i8* %5)
%6 = add nsw i32 %i.07, 1 ; <i32> [#uses=2]
%exitcond8 = icmp eq i32 %6, %argc ; <i1> [#uses=1]
br i1 %exitcond8, label %bb5, label %bb3
bb5: ; preds = %bb3, %bb3.us, %entry
ret i32 0
}
Not too readable perhaps, so let me point out what's here:
entry
: check if argc
is equal to 0, if it is, go to bb5
(exit) else go to bb.nph
bb.nph
: compute the value of condition
, if it's true, go to bb3.us
else go to bb3
bb3.us
and bb3
: loops for the true and false condition respectivelybb5
: exitA compiler can pretty much transform your code how it wants, as long as the effect is similar to what you asked for. In this case, it has effectively rewritten the code as:
int main(int argc, char**argv) {
if (argc != 0)
{
int i = 0;
if (argc % 2) {
do {
foo(argv[1]);
++i;
} while (i != argc);
} else {
do {
foo(argv[0]);
++i;
} while (i != argc);
}
}
return 0;
}
It's a form of Loop Invariant Optimization, combined here with a first check to avoid computing the condition if the loop is not going to get executed.
For those of us who would think the first solution is clearer, we're quite happy to have the compiler do the nitty gritty optimization for us!
Any decent optimizing compiler will do this if condition
can be proved to not change during the iteration.
Furthermore, even if the compiler does not in fact do this, you would do well to support your decision to rewrite the code to a less-human-readable form with hard data from profiling. Don't optimize prematurely. It isn't worth it to give readers of the code a "huh?" moment in order to shave off a few milliseconds (and "readers" definitely includes yourself at a future time).
I wouldn't advocate to take any action here, by the usual "premature optimization" arguments. Keeping the code clear is most important, and if the whole program is too slow, you may want to profile and find the actual bottlenecks (that you usually can't guess) after the program has been thoroughly debugged.
Even if the compiler does not optimize this particular case for you, you may want to know that CPU do some form of branch prediction which will greatly reduce the time needed to process the condition in case the condition is predictable.
Indeed, most CPU process instructions in a pipeline and by the time the jump address has to be determined, the condition variable may be unknown. This would cause a pipeline stall, and this is where most modern processors try to guess (smartly in fact) where the program will jump. If the condition variable is indeed known (as it is in your case), the guess will be perfect.
So I doubt that even with a "dumb" compiler, you would actually see a difference between the two options, at least on modern machines.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With