Under what conditions does an "if" / "else if" chain in C compile to a single comparison, as in assembly?

In x86 assembly, the "cmp" instruction sets the status flags, including "ZF" and "CF", so a single comparison is enough to determine whether two integers are equal or which one is greater. How should the code be written in C so that only one comparison is performed for all three cases? Six orderings are possible:

if (x > y) { /*code1*/ } else if (x < y) { /*code2*/ } else { /*code3*/ }

if (x < y) { /*code2*/ } else if (x > y) { /*code1*/ } else { /*code3*/ }

if (x > y) { /*code1*/ } else if (x == y) { /*code3*/ } else { /*code2*/ }

if (x < y) { /*code2*/ } else if (x == y) { /*code3*/ } else { /*code1*/ }

if (x == y) { /*code3*/ } else if (x < y) { /*code2*/ } else { /*code1*/ }

if (x == y) { /*code3*/ } else if (x > y) { /*code1*/ } else { /*code2*/ } 

Imyaf asked Dec 22 '25 06:12


2 Answers

Generally, in C we write a regular, naive, seemingly inefficient if/else-if/else statement and expect the compiler to optimize it.

So, if both x and y can be known by the compiler to be simple values that do not require re-evaluation, we can code the following construct in C:

if( x > y )
    { /* code1 */ }
else if( x < y )
    { /* code2 */ }
else /* x == y */
    { /* code3 */ }

and the generated optimized assembly should look more or less like the following:

    mov eax, [x]
    cmp eax, [y]
    jg code1
    jl code2
    /* code3 */
    jmp after
code1:
    /* code1 */
    jmp after
code2:
    /* code2 */
after:

Note that in the naive C code the variables x and y are accessed twice each, and compared twice, whereas in the optimized assembly code they are loaded and compared only once.

Here is the source code and the generated assembly on godbolt:

https://godbolt.org/z/8YxT5Kh7P

The instructions of interest are the following:

    cmp     eax, ebp
    jg      .L7
    jl      .L8

(Here, eax contains y, and ebp contains x.)
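
To try this yourself, a minimal self-contained version of the construct could look like the sketch below. The function name three_way and its return values (1, -1, 0) are made up for illustration; they stand in for code1, code2 and code3. The exact instructions you get depend on the compiler, the target and the optimization flags, and with branches this simple a compiler may even emit branch-free code instead of jg/jl.

```c
#include <assert.h>

/* Hypothetical helper, not from the original answer:
 * returns 1 if x > y, -1 if x < y, and 0 if they are equal. */
int three_way(int x, int y)
{
    if (x > y)
        return 1;   /* stands in for code1 */
    else if (x < y)
        return -1;  /* stands in for code2 */
    else            /* x == y */
        return 0;   /* stands in for code3 */
}

/* Small sanity check covering each of the three branches. */
void three_way_self_check(void)
{
    assert(three_way(2, 1) == 1);
    assert(three_way(1, 2) == -1);
    assert(three_way(7, 7) == 0);
}
```

Pasting three_way into Compiler Explorer with optimization enabled (for example -O2) lets you check whether your compiler emits a single cmp for it.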


The technique described above applies to all 6 cases listed in the question, as long as x and y are simple values.
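
The reason the compiler may treat all six orderings alike is that, for integers, they are semantically equivalent: exactly one of x > y, x < y and x == y holds. The sketch below checks this over a small range. The function names and the result codes 1, 2, 3 (for code1, code2, code3) are invented for illustration, and the check says nothing about the generated assembly, which still has to be inspected separately.

```c
#include <assert.h>

/* Each function returns 1 for code1 (x > y), 2 for code2 (x < y),
 * and 3 for code3 (x == y), using one of the six orderings. */
int order1(int x, int y) { if (x > y) return 1; else if (x < y) return 2; else return 3; }
int order2(int x, int y) { if (x < y) return 2; else if (x > y) return 1; else return 3; }
int order3(int x, int y) { if (x > y) return 1; else if (x == y) return 3; else return 2; }
int order4(int x, int y) { if (x < y) return 2; else if (x == y) return 3; else return 1; }
int order5(int x, int y) { if (x == y) return 3; else if (x < y) return 2; else return 1; }
int order6(int x, int y) { if (x == y) return 3; else if (x > y) return 1; else return 2; }

/* Returns 1 if all six orderings agree for every pair in [lo, hi], else 0. */
int all_orderings_agree(int lo, int hi)
{
    for (int x = lo; x <= hi; ++x) {
        for (int y = lo; y <= hi; ++y) {
            int r = order1(x, y);
            if (order2(x, y) != r || order3(x, y) != r || order4(x, y) != r ||
                order5(x, y) != r || order6(x, y) != r)
                return 0;
        }
    }
    return 1;
}

/* Sanity check over a small, arbitrarily chosen range. */
void check_orderings(void)
{
    assert(all_orderings_agree(-8, 8));
}
```

Note that this equivalence is specific to integer operands. With floating-point values, a NaN makes all three comparisons false, so the orderings no longer agree on which branch runs.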

If x or y requires re-evaluation (for example, if they are function calls), then we need a slightly different technique:

int xx = x(), yy = y();
if( xx > yy )
    { /* code1 */ }
else if( xx < yy )
    { /* code2 */ }
else /* xx == yy, meaning that x() == y() */
    { /* code3 */ }

Note that the variables xx and yy are likely to be completely optimized away by the compiler, resulting in optimized assembly code very similar to what was shown above.
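
To make the difference concrete, here is a small sketch in which x() and y() are hypothetical functions that count how often they are called. The counters, the return values 10 and 20, and the names compare_once and compare_naive are all assumptions made for this example.

```c
#include <assert.h>

/* Call counters for the hypothetical functions x() and y(). */
static int x_calls = 0;
static int y_calls = 0;

int x(void) { ++x_calls; return 10; }
int y(void) { ++y_calls; return 20; }

/* Evaluates x() and y() exactly once, then branches on the stored values.
 * Returns 1 for code1, 2 for code2, 3 for code3. */
int compare_once(void)
{
    int xx = x(), yy = y();
    if (xx > yy)
        return 1;
    else if (xx < yy)
        return 2;
    else /* xx == yy */
        return 3;
}

/* Calls x() and y() again in the second condition whenever the first is false. */
int compare_naive(void)
{
    if (x() > y())
        return 1;
    else if (x() < y())
        return 2;
    else
        return 3;
}

/* Shows that the two versions agree on the result but not on the call counts. */
void check_call_counts(void)
{
    x_calls = y_calls = 0;
    assert(compare_once() == 2);
    assert(x_calls == 1 && y_calls == 1);

    x_calls = y_calls = 0;
    assert(compare_naive() == 2);
    assert(x_calls == 2 && y_calls == 2);
}
```

Because the calls here have a visible side effect (the counters), the compiler is not allowed to merge them, so the temporaries are what make the single evaluation possible.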


These were examples of the widespread and well-established practice of writing naive constructs in C and expecting the compiler to optimize them in a certain way. However, in many cases the compiler does something other than what we expected.

So, if you get into the habit of checking whether the compiler did in fact do exactly as you expected it to do, be prepared to sometimes be surprised.

Mike Nakis answered Dec 23 '25 20:12


I think you should let the compiler decide the best way. Therefore, you should choose the most readable option.

If you are unsure, you can also check the generated code with Compiler Explorer.

I have made an example here: https://godbolt.org/z/v81z3rfMa

jokn answered Dec 23 '25 20:12