 

Why do some compilers optimize if(a>0) but not if(*(&a)>0)?

Let's say I have declared in the global scope:

const int a =0x93191;

And in the main function I have the following condition:

if(a>0)
    do_something

One odd thing I have noticed is that the RVDS compiler drops the if statement entirely: there is no compare or branch/jmp in the object file.

But if I write:

if(*(&a)>0)
    do_something

The if (a cmp and a branch) remains in the compiled object file.
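
For anyone who wants to reproduce this, a minimal sketch of a test file follows. The function names direct_use and through_address are hypothetical stand-ins for the two conditions above, and the return values replace do_something so that each function can be inspected on its own in the generated assembly.

```c
/* Same constant as in the question, declared at global scope. */
const int a = 0x93191;

/* Direct use of the constant: the form RVDS folded away. */
int direct_use(void)
{
    if (a > 0)
        return 1;
    return 0;
}

/* Same value read through *(&a): the form RVDS kept as cmp + branch. */
int through_address(void)
{
    if (*(&a) > 0)
        return 1;
    return 0;
}
```

Compiling with something like gcc -O2 -S and comparing the two function bodies shows whether a given compiler folds both forms or only the first.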


In contrast, GCC optimizes both forms at -O1, -O2, and -O3:

#include <stdio.h>
const int a = 3333;

int main()
{
    if (a > 333)
        printf("first\n");

    return 0;
}

compiled with -O3:

(gdb) disassemble main
Dump of assembler code for function main:
0x0000000100000f10 <main+0>:    push   %rbp
0x0000000100000f11 <main+1>:    mov    %rsp,%rbp
0x0000000100000f14 <main+4>:    lea    0x3d(%rip),%rdi        # 0x100000f58
0x0000000100000f1b <main+11>:   callq  0x100000f2a <dyld_stub_puts>
0x0000000100000f20 <main+16>:   xor    %eax,%eax
0x0000000100000f22 <main+18>:   pop    %rbp
0x0000000100000f23 <main+19>:   retq   
End of assembler dump.

And for

#include <stdio.h>
const int a = 3333;

int main()
{
    if (*(&a) > 333)
        printf("first\n");

    return 0;
}

will give:

(gdb) disassemble main
Dump of assembler code for function main:
0x0000000100000f10 <main+0>:    push   %rbp
0x0000000100000f11 <main+1>:    mov    %rsp,%rbp
0x0000000100000f14 <main+4>:    lea    0x3d(%rip),%rdi        # 0x100000f58
0x0000000100000f1b <main+11>:   callq  0x100000f2a <dyld_stub_puts>
0x0000000100000f20 <main+16>:   xor    %eax,%eax
0x0000000100000f22 <main+18>:   pop    %rbp
0x0000000100000f23 <main+19>:   retq   
End of assembler dump.

So GCC treats both forms the same (as it should) and RVDS doesn't?


I also tried to examine the effect of volatile: RVDS still dropped the if (a > 333), but GCC did not:

#include <stdio.h>
volatile const int a = 3333;

int main()
{
    if (a > 333)
        printf("first\n");

    return 0;
}

(gdb) disassemble main
Dump of assembler code for function main:
0x0000000100000f10 <main+0>:    push   %rbp
0x0000000100000f11 <main+1>:    mov    %rsp,%rbp
0x0000000100000f14 <main+4>:    cmpl   $0x14e,0x12a(%rip)        # 0x100001048 <a>
0x0000000100000f1e <main+14>:   jl     0x100000f2c <main+28>
0x0000000100000f20 <main+16>:   lea    0x39(%rip),%rdi        # 0x100000f60
0x0000000100000f27 <main+23>:   callq  0x100000f36 <dyld_stub_puts>
0x0000000100000f2c <main+28>:   xor    %eax,%eax
0x0000000100000f2e <main+30>:   pop    %rbp
0x0000000100000f2f <main+31>:   retq   
End of assembler dump.

There are probably some bugs in the version of RVDS I used.
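
For comparison, here is a small sketch of why the GCC output above looks like the expected one, assuming the usual C rule that every read of a volatile object has to be performed as written. The names plain_value, watched_value, check_plain and check_watched are hypothetical.

```c
/* Plain const: the compiler may substitute 3333 and drop the test. */
const int plain_value = 3333;

/* volatile const: the program may not write to it, but every read
   must actually be performed, so the compare cannot be folded. */
volatile const int watched_value = 3333;

/* Candidate for folding to "return 1". */
int check_plain(void)
{
    return plain_value > 333;
}

/* Must load watched_value from memory before comparing. */
int check_watched(void)
{
    return watched_value > 333;
}
```

With optimization enabled, check_plain can typically be reduced to returning 1, while check_watched has to keep the load and compare, as in the disassembly above.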

0x90 asked Jun 20 '13 17:06



2 Answers

The amount of analysis a compiler will do to answer "can I work out the actual value of this expression?" is not unbounded. If you write a sufficiently complex expression, the compiler will simply say "I don't know what the value is, so I'll generate code to compute it".

It is perfectly possible for a compiler to figure out that the value is not going to change. But it is also possible that some compilers "give up" partway through. It may also depend on where in the compilation chain this analysis is done.

This is probably a fairly typical example of the "as-if" rule: the compiler is allowed to perform any optimisation that produces the same result "as if" the code had been executed as written.
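
As a rough illustration of the "as-if" idea, the sketch below puts the code as written next to what an optimizer is allowed to emit in its place. The names limit, as_written and as_if are hypothetical, and the second function is only a hand-written picture of a possible result, not actual compiler output.

```c
const int limit = 3333;

/* What the source says: load limit, compare it with 333, branch. */
int as_written(void)
{
    if (limit > 333)
        return 1;
    return 0;
}

/* What an optimizer may generate instead: the observable result
   is identical, so the load, compare and branch can all go. */
int as_if(void)
{
    return 1;
}
```

A caller cannot tell the two functions apart, which is exactly the freedom the rule gives the compiler. It is a permission, not an obligation, so a compiler that keeps the comparison is still correct.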

Having said all that, this case should be fairly trivial (and, as per the comments, the compiler should consider *(&a) the same as a), so it seems strange that it doesn't get rid of the comparison.

Mats Petersson answered Nov 01 '22 08:11



Optimizations are implementation details of compilers. They take time and effort to implement, and compiler writers usually focus on the common uses of the language (that is, the return on investment of optimizing code patterns that are highly infrequent is close to nothing).

That being said, there is an important difference between the two pieces of code. In the first case a is not odr-used (a C++ term): it is only used as an rvalue, which means it can be processed as a compile-time constant. That is, when a is used directly (no address-of, no references bound to it), compilers immediately substitute the value in. The value must be known by the compiler without accessing the variable, since in C++ it could be used in contexts where constant expressions are required (e.g. defining the size of an array).
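
One caveat on the array-size remark: it holds for C++, whereas in C a const int object is not an integer constant expression. A sketch of the same idea in C therefore uses an enum constant. The names BUFFER_LEN, buffer and buffer_elements below are hypothetical.

```c
#include <stddef.h>

/* An enum constant is an integer constant expression in C:
   the compiler must know its value without reading any memory. */
enum { BUFFER_LEN = 16 };

/* Legal at file scope only because BUFFER_LEN is a constant expression. */
static int buffer[BUFFER_LEN];

/* Number of elements in buffer, computed from the array type itself. */
size_t buffer_elements(void)
{
    return sizeof buffer / sizeof buffer[0];
}
```

Because the value has to be available at compile time for the declaration to be valid at all, substituting it into a later comparison costs the compiler nothing extra.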

In the second case a is odr-used: the address is taken and the value at that location is read. The compiler must produce code that performs those steps before passing the result to the optimizer. The optimizer can in turn detect that the value is a constant and replace the whole operation with it, but this is a bit more involved than the previous case, where the compiler itself filled the value in.
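
The steps described here can be spelled out by hand. The sketch below is only an illustration of what the compiler conceptually produces for *(&a) before optimization. The function name address_taken_form and the locals p and loaded are hypothetical.

```c
const int a = 3333;

/* The address-taken form, split into its individual steps. */
int address_taken_form(void)
{
    const int *p = &a;   /* step 1: take the address of a        */
    int loaded = *p;     /* step 2: read the value at that place */
    return loaded > 333; /* step 3: compare the loaded value     */
}
```

An optimizer that traces p back to &a, and then a back to its initializer, can fold all three steps into a constant. One that stops tracking at the pointer has to keep the load and the compare, which would match the RVDS behaviour described in the question.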

David Rodríguez - dribeas answered Nov 01 '22 06:11
