Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can A C Compiler Eliminate This Conditional Test At Runtime?

Let's say I have pseudocode like this:

main() {
    BOOL b = get_bool_from_environment(); //get it from a file, network, registry, whatever
    while(true) {
        do_stuff(b);
    }
}

do_stuff(BOOL b) {
    if(b)
        path_a();
    else
        path_b();
}

Now, since we know that the external environment can influence get_bool_from_environment() to potentially produce either a true or false result, then we know that the code for both the true and false branches of if(b) must be included in the binary. We can't simply omit path_a(); or path_b(); from the code.

BUT -- we only set BOOL b the one time, and we always reuse the same value after program initialization.

If I were to make this valid C code and then compile it using gcc -O0, the if(b) would be repeatedly evaluated on the processor each time that do_stuff(b) is invoked, which inserts what are, in my opinion, needless instructions into the pipeline for a branch that is basically static after initialization.

If I were to assume that I actually had a compiler that was as stupid as gcc -O0, I would re-write this code to include a function pointer, and two separate functions, do_stuff_a() and do_stuff_b(), which don't perform the if(b) test, but simply go ahead and perform one of the two paths. Then, in main(), I would assign the function pointer based on the value of b, and call that function in the loop. This eliminates the branch, though it admittedly adds a memory access for the function pointer dereference (due to architecture implementation I don't think I really need to worry about that).

Is it possible, even in principle, for a compiler to take code of the same style as the original pseudocode sample, and to realize that the test is unnecessary once the value of b is assigned once in main()? If so, what is the theoretical name for this compiler optimization, and can you please give an example of an actual compiler implementation (open source or otherwise) which does this?

I realize that compilers can't generate dynamic code at runtime, and the only types of systems that could do that in principle would be bytecode virtual machines or interpreters (e.g. Java, .NET, Ruby, etc.) -- so the question remains whether or not it is possible to do this statically and generate code that contains both the path_a(); branch and the path_b() branch, but avoid evaluating the conditional test if(b) for every call of do_stuff(b);.

like image 474
allquixotic Avatar asked Dec 05 '25 06:12

allquixotic


1 Answers

If you tell your compiler to optimise, you have a good chance that the if(b) is evaluated only once.

Slightly modifying the given example, using the standard _Bool instead of BOOL, and adding the missing return types and declarations,

_Bool get_bool_from_environment(void);
void path_a(void);
void path_b(void);

void do_stuff(_Bool b) {
    if(b)
        path_a();
    else
        path_b();
}

int main(void) {
    _Bool b = get_bool_from_environment(); //get it from a file, network, registry, whatever
    while(1) {
        do_stuff(b);
    }
}

the (relevant part of the) produced assembly by clang -O3 [clang-3.0] is

    callq   get_bool_from_environment
    cmpb    $1, %al
    jne     .LBB1_2
    .align  16, 0x90
.LBB1_1:                                # %do_stuff.exit.backedge.us
                                        # =>This Inner Loop Header: Depth=1
    callq   path_a
    jmp     .LBB1_1
    .align  16, 0x90
.LBB1_2:                                # %do_stuff.exit.backedge
                                        # =>This Inner Loop Header: Depth=1
    callq   path_b
    jmp     .LBB1_2

b is tested only once, and main jumps into an infinite loop of either path_a or path_b depending on the value of b. If path_a and path_b are small enough, they would be inlined (I strongly expect). With -O and -O2, the code produced by clang would evaluate b in each iteration of the loop.

gcc (4.6.2) behaves similarly with -O3:

    call    get_bool_from_environment
    testb   %al, %al
    jne     .L8
    .p2align 4,,10
    .p2align 3
.L9:
    call    path_b
    .p2align 4,,6
    jmp     .L9
.L8:
    .p2align 4,,8
    call    path_a
    .p2align 4,,8
    call    path_a
    .p2align 4,,5
    jmp     .L8

oddly, it unrolled the loop for path_a, but not for path_b. With -O2 or -O, it would however call do_stuff in the infinite loop.

Hence to

Is it possible, even in principle, for a compiler to take code of the same style as the original pseudocode sample, and to realize that the test is unnecessary once the value of b is assigned once in main()?

the answer is a definitive Yes, it is possible for compilers to recognize this and take advantage of that fact. Good compilers do when asked to optimise hard.

If so, what is the theoretical name for this compiler optimization, and can you please give an example of an actual compiler implementation (open source or otherwise) which does this?

I don't know the name of the optimisation, but two implementations doing that are gcc and clang (at least, recent enough releases).

like image 90
Daniel Fischer Avatar answered Dec 07 '25 20:12

Daniel Fischer