I have the g++ 4.7.3 compiler. I'm trying to follow the description of the optimisation flags at http://gcc.gnu.org/onlinedocs/gcc-4.7.3/gcc/Optimize-Options.html and have run into the following problem:
I have a program that gives different timings with the -O2 and -O3 flags. -O2 is twice as fast as -O3: the time is 8ms with -O2 and 16ms with -O3.
So I would like to understand what exactly makes the difference. In the link above I see:
"O3 Optimize yet more. -O3 turns on all optimizations specified by -O2 and also turns on the -finline-functions, -funswitch-loops, -fpredictive-commoning, -fgcse-after-reload, -ftree-vectorize and -fipa-cp-clone options."
So I simply take -O2 and add all described flags:
-O2 -finline-functions -funswitch-loops -fpredictive-commoning -fgcse-after-reload -ftree-vectorize -fipa-cp-clone
And the time is 30ms. But this set of options should be equivalent to -O3. Why is the time different? What am I doing wrong?
P.S. All results are perfectly reproducible with precision of 1ms.
I have checked the options using
g++ -c -Q -Ox --help=optimizers
and saw that -O3 has one additional option: -ftree-loop-distribute-patterns. But when I add it to the option set:
-O2 -finline-functions -funswitch-loops -fpredictive-commoning -fgcse-after-reload -ftree-vectorize -fipa-cp-clone -ftree-loop-distribute-patterns
the time is still 30ms.
Turning on optimization flags makes the compiler attempt to improve the performance and/or code size at the expense of compilation time and possibly the ability to debug the program.
Order does matter when you use several options of the same kind; for example, if you specify -L more than once, the directories are searched in the order specified. Also, the placement of the -l option is significant.
-O3 instructs the compiler to optimize for the performance of the generated code and disregard its size, which might result in increased code size. It also degrades the debug experience compared to -O2.
The "-O3 in general is unsafe" guidance from Torvalds stems from a recent kernel thread in which some developers had been discussing -O3 usage while also passing "-fno-tree-loop-vectorize" to work around some versions of GCC possibly generating bad code at the higher optimization level.
You can get g++ to show you which options are active with the -Q option:
g++ -c -Q -O3 --help=optimizers
The output is something like:
-O<number>
-Ofast
-Os
-falign-functions [enabled]
-falign-jumps [enabled]
-falign-labels [enabled]
-falign-loops [enabled]
-fasynchronous-unwind-tables [enabled]
-fbranch-count-reg [enabled]
-fbranch-probabilities [disabled]
-fbranch-target-load-optimize [disabled]
-fbranch-target-load-optimize2 [disabled]
-fbtr-bb-exclusive [disabled]
-fcaller-saves [enabled]
-fcombine-stack-adjustments [enabled]
-fcommon [enabled]
-fcompare-elim [enabled]
-fconserve-stack [disabled]
-fcprop-registers [enabled]
-fcrossjumping [enabled]
-fcse-follow-jumps [enabled]
-fcx-fortran-rules [disabled]
-fcx-limited-range [disabled]
-fdata-sections [disabled]
-fdce [enabled]
etc.
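To compare two such listings (say, one captured with -O3 and one captured with -O2 plus the hand-added flags), a small script can do the diff for you. The following is only a sketch in Python: the names `enabled_flags` and `diff_flags` are made up for illustration, the two sample strings are shortened, hypothetical excerpts rather than real compiler output, and lines for options that take a value (anything not ending in [enabled] or [disabled]) are simply skipped.

```python
import re

# Matches lines such as "  -ftree-vectorize   [enabled]" in the
# `g++ -c -Q -O3 --help=optimizers` listing. Header lines like
# "-O<number>" and options that take a value are deliberately not matched.
FLAG_LINE = re.compile(r"^\s*(-f[\w-]+)\s+\[(enabled|disabled)\]\s*$")


def enabled_flags(q_output):
    """Return the set of -f options reported as [enabled]."""
    flags = set()
    for line in q_output.splitlines():
        match = FLAG_LINE.match(line)
        if match and match.group(2) == "enabled":
            flags.add(match.group(1))
    return flags


def diff_flags(first, second):
    """Return (enabled only in first, enabled only in second), each sorted."""
    a, b = enabled_flags(first), enabled_flags(second)
    return sorted(a - b), sorted(b - a)


# Hypothetical, shortened excerpts used only to exercise the functions.
O2_SAMPLE = """\
  -O<number>
  -falign-loops               [enabled]
  -ftree-vectorize            [disabled]
  -fdce                       [enabled]
"""

O3_SAMPLE = """\
  -O<number>
  -falign-loops               [enabled]
  -ftree-vectorize            [enabled]
  -fdce                       [enabled]
"""

if __name__ == "__main__":
    only_first, only_second = diff_flags(O2_SAMPLE, O3_SAMPLE)
    print("enabled only in first listing: ", only_first)
    print("enabled only in second listing:", only_second)
```

With real data you would redirect each `g++ -c -Q ... --help=optimizers` run to a file and pass the two file contents to `diff_flags`. If both lists come back empty and the timings still differ, the difference is not visible in this listing.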