In order to know how exactly the gcc do the optimization, I have written two program compiling with -O2, but there is some difference of the assembly code. In my programs, I want to output "hello" in the loop, and add some delay between each output. These two programs are only for illustrating my question, and I know I can using volatile or asm in program 1 to achieve my purpose.
Program 1
#include <stdio.h>
int main(int argc, char **argv)
{
unsigned long i = 0;
while (1) {
if (++i > 0x1fffffffUL) {
printf("hello\n");
i = 0;
}
}
}
Compile with -O2, the assembly code is:
Disassembly of section .text.startup:
00000000 <_main>:
#include <stdio.h>
int main(int argc, char **argv)
{
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 83 e4 f0 and $0xfffffff0,%esp
6: 83 ec 10 sub $0x10,%esp
9: e8 00 00 00 00 call e <_main+0xe>
e: 66 90 xchg %ax,%ax
10: c7 04 24 00 00 00 00 movl $0x0,(%esp)
17: e8 00 00 00 00 call 1c <_main+0x1c>
1c: eb f2 jmp 10 <_main+0x10>
1e: 90 nop
1f: 90 nop
Program 2
int main(int argc, char **argv)
{
unsigned long i = 0;
while (1) {
if (i > 0x1fffffffUL) {
printf("hello\n");
i = 0;
}
i++;
}
}
Compile with -O2, the assembly code is:
Disassembly of section .text.startup:
00000000 <_main>:
#include <stdio.h>
int main(int argc, char **argv)
{
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 83 e4 f0 and $0xfffffff0,%esp
6: 83 ec 10 sub $0x10,%esp
9: e8 00 00 00 00 call e <_main+0xe>
e: 31 c0 xor %eax,%eax
10: 83 c0 01 add $0x1,%eax
13: 3d ff ff ff 1f cmp $0x1fffffff,%eax
18: 76 f6 jbe 10 <_main+0x10>
1a: c7 04 24 00 00 00 00 movl $0x0,(%esp)
while (1) {
if (i > 0x1fffffffUL) {
printf("hello\n");
i = 0;
}
i++;
21: e8 00 00 00 00 call 26 <_main+0x26>
int main(int argc, char **argv)
{
unsigned long i = 0;
while (1) {
if (i > 0x1fffffffUL) {
26: 31 c0 xor %eax,%eax
28: eb e6 jmp 10 <_main+0x10>
printf("hello\n");
2a: 90 nop
2b: 90 nop
2c: 90 nop
2d: 90 nop
2e: 90 nop
2f: 90 nop
In program 1, the increase of i
is optimized out, but it's not in program 2. Why this happens? What rules is gcc using when optimizing with -O2 for these two programs?
The compiler optimizes to reduce the size of the binary instead of execution speed. If you do not specify an optimization option, gcc attempts to reduce the compilation time and to make debugging always yield the result expected from reading the source code.
Compiler optimization is generally implemented using a sequence of optimizing transformations, algorithms which take a program and transform it to produce a semantically equivalent output program that uses fewer resources or executes faster.
After the file is preprocessed, gcc moves it to the compiler. The compiler turns each line in the preprocessed file into assembly language, which are instructions in English mnemonics that have strong one-to-one correspondence to the machine code that computers can read.
The C/C++ compiler compiles each source file separately and produces the corresponding object file. This means the compiler can only apply optimizations on a single source file rather than on the whole program. However, some important optimizations can be performed only by looking at the whole program.
Asking "why" about optimizers is usually a waste of time, because there are no "rules" by which optimizers operate -- other than "as if": The optimizer may not change the observable behaviour of conforming code.
The "observable behaviour" of both your programs is to print "hello" repeatedly.
In your first program, the counting is optimized away, making the observable behaviour happen faster. That is the job of an optimizer. Be happy your code is more efficient now!
In your second program, the counting is not optimized away, because somehow the optimizer -- in this version of this compiler with this setting -- did not see that it could do without it. Why? Who knows (other than the maintainer of the compiler's optimizer module)?
If your desired behaviour is to have a delay between outputs, use something like thrd_sleep(). Empty count loops were a way to delay BASIC 2.0 programs on the C64, but they should not be used in C, for the exact reason you just observed: You never know what the optimizer does.
The branching in the if statement now depends on something that happened in the previous iteration of the loop. In particular, the compiler can easily determine in program 1 that i
is incremented in every iteration of the while loop (as it is right at the top), while this is not the case in program 2.
Anyway, compiler optimizations are very complicated. See below:
gcc -O2 is a shortcut for these flags: (from the documentation)
-fauto-inc-dec
-fbranch-count-reg
-fcombine-stack-adjustments
-fcompare-elim
-fcprop-registers
-fdce
-fdefer-pop
-fdelayed-branch
-fdse
-fforward-propagate
-fguess-branch-probability
-fif-conversion2
-fif-conversion
-finline-functions-called-once
-fipa-pure-const
-fipa-profile
-fipa-reference
-fmerge-constants
-fmove-loop-invariants
-freorder-blocks
-fshrink-wrap
-fsplit-wide-types
-fssa-backprop
-fssa-phiopt
-ftree-bit-ccp
-ftree-ccp
-ftree-ch
-ftree-coalesce-vars
-ftree-copy-prop
-ftree-dce
-ftree-dominator-opts
-ftree-dse
-ftree-forwprop
-ftree-fre
-ftree-phiprop
-ftree-sink
-ftree-slsr
-ftree-sra
-ftree-pta
-ftree-ter
-funit-at-a-time
-fthread-jumps
-falign-functions -falign-jumps
-falign-loops -falign-labels
-fcaller-saves
-fcrossjumping
-fcse-follow-jumps -fcse-skip-blocks
-fdelete-null-pointer-checks
-fdevirtualize -fdevirtualize-speculatively
-fexpensive-optimizations
-fgcse -fgcse-lm
-fhoist-adjacent-loads
-finline-small-functions
-findirect-inlining
-fipa-cp
-fipa-cp-alignment
-fipa-sra
-fipa-icf
-fisolate-erroneous-paths-dereference
-flra-remat
-foptimize-sibling-calls
-foptimize-strlen
-fpartial-inlining
-fpeephole2
-freorder-blocks-algorithm=stc
-freorder-blocks-and-partition -freorder-functions
-frerun-cse-after-loop
-fsched-interblock -fsched-spec
-fschedule-insns -fschedule-insns2
-fstrict-aliasing -fstrict-overflow
-ftree-builtin-call-dce
-ftree-switch-conversion -ftree-tail-merge
-ftree-pre
-ftree-vrp
-fipa-ra
Each of these flags corresponds to a different possible optimization the compiler is allowed to make.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With