I wrote this simple C program:
int main() { int i; int count = 0; for(i = 0; i < 2000000000; i++){ count = count + 1; } }
I wanted to see how the gcc compiler optimizes this loop (clearly add 1 2000000000 times should be "add 2000000000 one time"). So:
gcc test.c and then time
on a.out
gives:
real 0m7.717s user 0m7.710s sys 0m0.000s
$ gcc -O2 test.c and then time on
a.out` gives:
real 0m0.003s user 0m0.000s sys 0m0.000s
Then I disassembled both with gcc -S
. First one seems quite clear:
.file "test.c" .text .globl main .type main, @function main: .LFB0: .cfi_startproc pushq %rbp .cfi_def_cfa_offset 16 movq %rsp, %rbp .cfi_offset 6, -16 .cfi_def_cfa_register 6 movl $0, -8(%rbp) movl $0, -4(%rbp) jmp .L2 .L3: addl $1, -8(%rbp) addl $1, -4(%rbp) .L2: cmpl $1999999999, -4(%rbp) jle .L3 leave .cfi_def_cfa 7, 8 ret .cfi_endproc .LFE0: .size main, .-main .ident "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2" .section .note.GNU-stack,"",@progbits
L3 adds, L2 compare -4(%rbp)
with 1999999999
and loops to L3 if i < 2000000000
.
Now the optimized one:
.file "test.c" .text .p2align 4,,15 .globl main .type main, @function main: .LFB0: .cfi_startproc rep ret .cfi_endproc .LFE0: .size main, .-main .ident "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2" .section .note.GNU-stack,"",@progbits
I can't understand at all what's going on there! I've got little knowledge of assembly, but I expected something like
addl $2000000000, -8(%rbp)
I even tried with gcc -c -g -Wa,-a,-ad -O2 test.c to see the C code together with the assembly it was converted to, but the result was no more clear that the previous one.
Can someone briefly explain:
The compiler optimizes to reduce the size of the binary instead of execution speed. If you do not specify an optimization option, gcc attempts to reduce the compilation time and to make debugging always yield the result expected from reading the source code.
GCC performs nearly all supported optimizations that do not involve a space-speed tradeoff. As compared to -O , this option increases both compilation time and the performance of the generated code.
Use the command-line option -O0 (-[capital o][zero]) to disable optimization, and -S to get assembly file. Look here to see more gcc command-line options. Show activity on this post.
The compiler is even smarter than that. :)
In fact, it realizes that you aren't using the result of the loop. So it took out the entire loop completely!
This is called Dead Code Elimination.
A better test is to print the result:
#include <stdio.h> int main(void) { int i; int count = 0; for(i = 0; i < 2000000000; i++){ count = count + 1; } // Print result to prevent Dead Code Elimination printf("%d\n", count); }
EDIT : I've added the required #include <stdio.h>
; the MSVC assembly listing corresponds to a version without the #include
, but it should be the same.
I don't have GCC in front of me at the moment, since I'm booted into Windows. But here's the disassembly of the version with the printf()
on MSVC:
EDIT : I had the wrong assembly output. Here's the correct one.
; 57 : int main(){ $LN8: sub rsp, 40 ; 00000028H ; 58 : ; 59 : ; 60 : int i; int count = 0; ; 61 : for(i = 0; i < 2000000000; i++){ ; 62 : count = count + 1; ; 63 : } ; 64 : ; 65 : // Print result to prevent Dead Code Elimination ; 66 : printf("%d\n",count); lea rcx, OFFSET FLAT:??_C@_03PMGGPEJJ@?$CFd?6?$AA@ mov edx, 2000000000 ; 77359400H call QWORD PTR __imp_printf ; 67 : ; 68 : ; 69 : ; 70 : ; 71 : return 0; xor eax, eax ; 72 : } add rsp, 40 ; 00000028H ret 0
So yes, Visual Studio does this optimization. I'd assume GCC probably does too.
And yes, GCC performs a similar optimization. Here's an assembly listing for the same program with gcc -S -O2 test.c
(gcc 4.5.2, Ubuntu 11.10, x86):
.file "test.c" .section .rodata.str1.1,"aMS",@progbits,1 .LC0: .string "%d\n" .text .p2align 4,,15 .globl main .type main, @function main: pushl %ebp movl %esp, %ebp andl $-16, %esp subl $16, %esp movl $2000000000, 8(%esp) movl $.LC0, 4(%esp) movl $1, (%esp) call __printf_chk leave ret .size main, .-main .ident "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2" .section .note.GNU-stack,"",@progbits
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With