I have some legacy code that compiles with both -O2 and -O3 set. From the GCC man page I get the guarantee that:
-O3 turns on all optimizations specified by -O2 and also turns on the -finline-functions, -funswitch-loops, -fpredictive-commoning, -fgcse-after-reload and -ftree-vectorize options.
So, at first glance it would seem likely that turning both of these flags on would be the same as just -O3. However, that got me thinking: is that the right thing to do in that case, given that -O2 is probably the "safer" option? Obviously, it is a simple matter to compile some code with all of the permutations and see what happens in each case, but I was wondering if anyone knows whether GCC has a specific policy regarding multiple optimization levels and, if so, what the reasoning behind it is.
To be pedantic, there are 8 different valid -O options you can give to gcc, though there are some that mean the same thing.
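-Ofast can be a little bit faster than -O3 because it also enables -ffast-math, which allows GCC to generate the approximate rsqrtps and rcpps instructions at the cost of strict IEEE floating-point behavior.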
Optimization level -O3: -O3 instructs the compiler to optimize for the performance of generated code and disregard the size of the generated code, which might result in an increased code size. It also degrades the debug experience compared to -O2.
Pragmas are implementation specific but, in this case (gcc), the pragma sets the optimisation level to 3 (high), similar in effect to using -O3 on the command line. Details on optimisation levels for gcc, and the individual flags that get set in response, can be found in the GCC documentation.
From the man page:
If you use multiple -O options, with or without level numbers, the last such option is the one that is effective.
For over-concerned users like myself, here is some code begging for optimization:
$ cat dominant_flag.c
#include <stdio.h>

int foo(int i)
{
    return 3*i+122;
}

int main(int argc, char **argv)
{
    return foo(0xface); // meant to be optimized out
}
And here are four compilation scenarios:
$ gcc -g -O0     dominant_flag.c -o flag0
$ gcc -g -O3     dominant_flag.c -o flag3
$ gcc -g -O0 -O3 dominant_flag.c -o flag03
$ gcc -g -O3 -O0 dominant_flag.c -o flag30
Once I look for the constant 0xface, I see it exists in the non-optimized versions:
$ objdump -S -D flag0 | grep -w "\$0xface"
# 61e: bf ce fa 00 00    mov $0xface,%edi
$ objdump -S -D flag30 | grep -w "\$0xface"
# 61e: bf ce fa 00 00    mov $0xface,%edi
and optimized out in the optimized versions:
$ objdump -S -D flag3 | grep -w "\$0xface"
$ objdump -S -D flag03 | grep -w "\$0xface"
The whole foo call is gone:
$ objdump -S -D flag03 | sed -n "297,298p;299q"
 4f0: b8 e4 f0 02 00    mov $0x2f0e4,%eax   # <--- hex(3*0xface+122)=0x2f0e4
 4f5: c3                retq