
Big differences in GCC code generation when compiling as C++ vs C

I've been playing around a little bit with x86-64 assembly trying to learn more about the various SIMD extensions that are available (MMX, SSE, AVX).

In order to see how different C or C++ constructs are translated into machine code by GCC I've been using Compiler Explorer which is a superb tool.

During one of my 'play sessions' I wanted to see how GCC could optimize a simple run-time initialization of an integer array. In this case I tried to write the numbers 0 to 2047 to an array of 2048 unsigned integers.

The code looks as follows:

    unsigned int buffer[2048];

    void setup()
    {
      for (unsigned int i = 0; i < 2048; ++i)
      {
        buffer[i] = i;
      }
    }

If I enable optimizations and AVX-512 instructions (-O3 -mavx512f -mtune=intel), GCC 6.3 generates some really clever code :)

    setup():
            mov     eax, OFFSET FLAT:buffer
            mov     edx, OFFSET FLAT:buffer+8192
            vmovdqa64       zmm0, ZMMWORD PTR .LC0[rip]
            vmovdqa64       zmm1, ZMMWORD PTR .LC1[rip]
    .L2:
            vmovdqa64       ZMMWORD PTR [rax], zmm0
            add     rax, 64
            cmp     rdx, rax
            vpaddd  zmm0, zmm0, zmm1
            jne     .L2
            ret
    buffer:
            .zero   8192
    .LC0:
            .long   0
            .long   1
            .long   2
            .long   3
            .long   4
            .long   5
            .long   6
            .long   7
            .long   8
            .long   9
            .long   10
            .long   11
            .long   12
            .long   13
            .long   14
            .long   15
    .LC1:
            .long   16
            .long   16
            .long   16
            .long   16
            .long   16
            .long   16
            .long   16
            .long   16
            .long   16
            .long   16
            .long   16
            .long   16
            .long   16
            .long   16
            .long   16
            .long   16

However, when I tested what would be generated if the same code were compiled with the GCC C compiler, by adding the flag -x c, I was really surprised.

I expected similar, if not identical, results but the C-compiler seems to generate much more complicated and presumably also much slower machine code. The resulting assembly is too large to paste here in full, but it can be viewed at godbolt.org by following this link.

A snippet of the generated code, lines 58 to 83, can be seen below:

    .L2:
            vpbroadcastd    zmm0, r8d
            lea     rsi, buffer[0+rcx*4]
            vmovdqa64       zmm1, ZMMWORD PTR .LC1[rip]
            vpaddd  zmm0, zmm0, ZMMWORD PTR .LC0[rip]
            xor     ecx, ecx
    .L4:
            add     ecx, 1
            add     rsi, 64
            vmovdqa64       ZMMWORD PTR [rsi-64], zmm0
            cmp     ecx, edi
            vpaddd  zmm0, zmm0, zmm1
            jb      .L4
            sub     edx, r10d
            cmp     r9d, r10d
            lea     eax, [r8+r10]
            je      .L1
            mov     ecx, eax
            cmp     edx, 1
            mov     DWORD PTR buffer[0+rcx*4], eax
            lea     ecx, [rax+1]
            je      .L1
            mov     esi, ecx
            cmp     edx, 2
            mov     DWORD PTR buffer[0+rsi*4], ecx
            lea     ecx, [rax+2]

As you can see, this code has a lot of complicated moves and jumps and in general feels like a very complex way of performing a simple array initialization.

Why is there such a big difference in the generated code?

Is the GCC C++-compiler better in general at optimizing code that is valid in both C and C++ when compared to the C-compiler?

asked Dec 22 '16 23:12 by JonatanE




1 Answer

The extra code handles possible misalignment: the instruction used, vmovdqa64, requires its memory operand to be 64-byte aligned when it operates on a full zmm register.
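To make the alignment requirement concrete, here is a minimal sketch of how one might check whether an address is 64-byte aligned. The helper name is_aligned and the array aligned_buffer are hypothetical and not part of the original code, and __attribute__((aligned(64))) is a GCC extension used here only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if address p is a multiple of
   `alignment` bytes, 0 otherwise. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* Illustration only: ask GCC for 64-byte alignment explicitly, which is
   the alignment a full-width vmovdqa64 store needs. */
static unsigned int aligned_buffer[2048] __attribute__((aligned(64)));
```

An address 4 bytes past the start of aligned_buffer still meets the basic alignment of unsigned int on x86-64, but an aligned 64-byte store to it would fault. The extra code in the C output presumably guards against that kind of case.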

My testing shows that, even though the standard doesn't allow it, gcc in C mode does let a definition in another module override the one here. That definition might comply only with the basic alignment requirement (4 bytes), so the compiler can't rely on the bigger alignment. Technically, gcc emits a .comm assembly directive for this tentative definition, while an external definition uses a normal symbol in the .data section. During linking, the normal symbol takes precedence over the .comm one.

Note that if you change the program to use extern unsigned int buffer[2048]; then even the C++ version will contain the added code. Conversely, declaring it as static unsigned int buffer[2048]; turns the C version into the optimized one.
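As a sketch of the static variant described above, the question's code could be written as follows. The BUFFER_LEN macro and the buffer_is_initialized function are hypothetical additions for illustration; only the static qualifier reflects the change discussed in this answer.

```c
#define BUFFER_LEN 2048

/* static gives the array internal linkage, so no definition in another
   module can replace it and the compiler knows which object it is
   writing to. */
static unsigned int buffer[BUFFER_LEN];

void setup(void)
{
    for (unsigned int i = 0; i < BUFFER_LEN; ++i) {
        buffer[i] = i;
    }
}

/* Hypothetical checker: returns 1 if buffer holds 0..BUFFER_LEN-1. */
int buffer_is_initialized(void)
{
    for (unsigned int i = 0; i < BUFFER_LEN; ++i) {
        if (buffer[i] != i) {
            return 0;
        }
    }
    return 1;
}
```

Compiling this with -x c and the question's flags (-O3 -mavx512f -mtune=intel) and inspecting the output, for example in Compiler Explorer, should show whether the alignment-handling code disappears. The exact result may depend on the GCC version.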

answered Sep 30 '22 02:09 by Jester