Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does GCC optimize out an unused variable incremented inside a loop?

I wrote this simple C program:

int main() {     int i;     int count = 0;     for(i = 0; i < 2000000000; i++){         count = count + 1;     } } 

I wanted to see how the gcc compiler optimizes this loop (clearly add 1 2000000000 times should be "add 2000000000 one time"). So:

gcc test.c and then time on a.out gives:

real 0m7.717s   user 0m7.710s   sys 0m0.000s   

$ gcc -O2 test.c and then time ona.out` gives:

real 0m0.003s   user 0m0.000s   sys 0m0.000s   

Then I disassembled both with gcc -S. First one seems quite clear:

    .file "test.c"       .text   .globl main     .type   main, @function   main: .LFB0:     .cfi_startproc     pushq   %rbp     .cfi_def_cfa_offset 16     movq    %rsp, %rbp     .cfi_offset 6, -16     .cfi_def_cfa_register 6     movl    $0, -8(%rbp)     movl    $0, -4(%rbp)     jmp .L2 .L3:     addl    $1, -8(%rbp)     addl    $1, -4(%rbp) .L2:     cmpl    $1999999999, -4(%rbp)     jle .L3     leave     .cfi_def_cfa 7, 8     ret     .cfi_endproc .LFE0:     .size   main, .-main     .ident  "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2"     .section    .note.GNU-stack,"",@progbits 

L3 adds, L2 compare -4(%rbp) with 1999999999 and loops to L3 if i < 2000000000.

Now the optimized one:

    .file "test.c"       .text     .p2align 4,,15 .globl main     .type main, @function main: .LFB0:     .cfi_startproc     rep     ret     .cfi_endproc .LFE0:     .size main, .-main     .ident "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2"     .section .note.GNU-stack,"",@progbits 

I can't understand at all what's going on there! I've got little knowledge of assembly, but I expected something like

addl $2000000000, -8(%rbp) 

I even tried with gcc -c -g -Wa,-a,-ad -O2 test.c to see the C code together with the assembly it was converted to, but the result was no more clear that the previous one.

Can someone briefly explain:

  1. The gcc -S -O2 output.
  2. If the loop is optimized as I expected (one sum instead of many sums)?
like image 694
Haile Avatar asked Jan 12 '12 20:01

Haile


People also ask

What optimization does gcc do?

The compiler optimizes to reduce the size of the binary instead of execution speed. If you do not specify an optimization option, gcc attempts to reduce the compilation time and to make debugging always yield the result expected from reading the source code.

Is gcc an optimizing compiler?

GCC performs nearly all supported optimizations that do not involve a space-speed tradeoff. As compared to -O , this option increases both compilation time and the performance of the generated code.

How do I stop gcc optimization?

Use the command-line option -O0 (-[capital o][zero]) to disable optimization, and -S to get assembly file. Look here to see more gcc command-line options. Show activity on this post.


1 Answers

The compiler is even smarter than that. :)

In fact, it realizes that you aren't using the result of the loop. So it took out the entire loop completely!

This is called Dead Code Elimination.

A better test is to print the result:

#include <stdio.h> int main(void) {     int i; int count = 0;     for(i = 0; i < 2000000000; i++){         count = count + 1;     }      //  Print result to prevent Dead Code Elimination     printf("%d\n", count); } 

EDIT : I've added the required #include <stdio.h>; the MSVC assembly listing corresponds to a version without the #include, but it should be the same.


I don't have GCC in front of me at the moment, since I'm booted into Windows. But here's the disassembly of the version with the printf() on MSVC:

EDIT : I had the wrong assembly output. Here's the correct one.

; 57   : int main(){  $LN8:     sub rsp, 40                 ; 00000028H  ; 58   :  ; 59   :  ; 60   :     int i; int count = 0; ; 61   :     for(i = 0; i < 2000000000; i++){ ; 62   :         count = count + 1; ; 63   :     } ; 64   :  ; 65   :     //  Print result to prevent Dead Code Elimination ; 66   :     printf("%d\n",count);      lea rcx, OFFSET FLAT:??_C@_03PMGGPEJJ@?$CFd?6?$AA@     mov edx, 2000000000             ; 77359400H     call    QWORD PTR __imp_printf  ; 67   :  ; 68   :  ; 69   :  ; 70   : ; 71   :     return 0;      xor eax, eax  ; 72   : }      add rsp, 40                 ; 00000028H     ret 0 

So yes, Visual Studio does this optimization. I'd assume GCC probably does too.

And yes, GCC performs a similar optimization. Here's an assembly listing for the same program with gcc -S -O2 test.c (gcc 4.5.2, Ubuntu 11.10, x86):

        .file   "test.c"         .section        .rodata.str1.1,"aMS",@progbits,1 .LC0:         .string "%d\n"         .text         .p2align 4,,15 .globl main         .type   main, @function main:         pushl   %ebp         movl    %esp, %ebp         andl    $-16, %esp         subl    $16, %esp         movl    $2000000000, 8(%esp)         movl    $.LC0, 4(%esp)         movl    $1, (%esp)         call    __printf_chk         leave         ret         .size   main, .-main         .ident  "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2"         .section        .note.GNU-stack,"",@progbits 
like image 134
Mysticial Avatar answered Sep 20 '22 07:09

Mysticial