Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

why huge compilation time difference between g++ and clang++?

Tags:

c++

gcc

clang

i came across a talk slide and at page 12 there is an example illustrating the difficulty of type checking in the presence of inheritance.

struct a {
  typedef int foo;
};
struct a1 : a {};
struct a2 : a {};
#define X(b, a)                \
  struct a##1 : b##1, b##2 {}; \
  struct a##2 : b##1, b##2 {};
X(a, b) X(b, c) X(c, d) X(d, e) X(e, f) X(f, g) X(g, h) X(h, i) X(i, j) X(j, k)
    X(k, l) X(l, m) X(m, n) n1::foo main() {}

After preprocessing it is tranformed into:

struct a {
  typedef int foo;
};
struct a1 : a {};
struct a2 : a {};

struct b1 : a1, a2 {};
struct b2 : a1, a2 {};
struct c1 : b1, b2 {};
struct c2 : b1, b2 {};
struct d1 : c1, c2 {};
struct d2 : c1, c2 {};
struct e1 : d1, d2 {};
struct e2 : d1, d2 {};
struct f1 : e1, e2 {};
struct f2 : e1, e2 {};
struct g1 : f1, f2 {};
struct g2 : f1, f2 {};
struct h1 : g1, g2 {};
struct h2 : g1, g2 {};
struct i1 : h1, h2 {};
struct i2 : h1, h2 {};
struct j1 : i1, i2 {};
struct j2 : i1, i2 {};
struct k1 : j1, j2 {};
struct k2 : j1, j2 {};
struct l1 : k1, k2 {};
struct l2 : k1, k2 {};
struct m1 : l1, l2 {};
struct m2 : l1, l2 {};
struct n1 : m1, m2 {};
struct n2 : m1, m2 {};
n1::foo main() {}

when I compile it with g++(Debian 4.9.1-16) ,it does consume a lot of time:

$ time g++ type_check.cc
real    29.35s
user    29.34s
sys     0.00s

while clang++(clang version 3.4.2 (tags/RELEASE_34/dot2-final), x86_64-unknown-linux-gnu) does it much faster.

$ time clang++ type_check.cc

real    0.06s
user    0.04s
sys     0.00s

Why check whether the types of foo on both sides of the multiple inheritance match cost so much time for gcc/icc, while not for clang?

like image 651
Hongxu Chen Avatar asked Oct 23 '14 05:10

Hongxu Chen


People also ask

Does Clang compile faster than GCC?

Clang is much faster and uses far less memory than GCC. Clang aims to provide extremely clear and concise diagnostics (error and warning messages), and includes support for expressive diagnostics. GCC's warnings are sometimes acceptable, but are often confusing and it does not support expressive diagnostics.

Is GCC faster than LLVM?

While LLVM's Clang C/C++ compiler was traditionally known for its faster build speeds than GCC, in recent releases of GCC the build speeds have improved and in some areas LLVM/Clang has slowed down with further optimization passes and other work added to its growing code-base.

Is G ++ a Clang?

Show activity on this post. In modern versions of macOS, g++ is just a little shim that points to the relevant part of clang in whichever version of Xcode you have installed.

Is Clang a good compiler?

Clang is not just like any other compiler, it comes with an Infrastructure that allows it to build tools and it can also extend its behaviors easily. The LLVM/Clang source code includes many tools while others are available on the web. Clang can be used as a very good C/C++ parser for building tools.


1 Answers

I guess it is because of a premature optimization and so a bug.

I think GCC keeps track of inheritance and like Clang memoize base classes to spot ambiguities immediatly when parsing "next base class", but differently from Clang GCC skip this fase if there are no members (however it should consider typedef to be members in my opinion)

The proof:

alter struct a

struct a {
    int a; //add this
    typedef int foo;
};

and code would compile as fast as (same order of magnitude at least) Clang.

Tested on GCC 4.8.0/4.8.1/4.9.0

EDIT:

Since @HongxuChen asked more information I'm editing now: those are all personal considerations wich may also not be correct, but seems reasonable to me.

I'm not surprised compile times explode with so deep inheritance tree without memoization.

Basically you have exponential complexity, if you try to add

X(a, b) X(b, c) X(c, d) X(d, e) X(e, f) X(f, g) X(g, h) X(h, i) X(i, j) X(j, k)
X(k, l) X(l, m) X(m, n) 
X(n, o)                 //add this one
o1::foo main() {}

you'll find compile time is now twice.

Each time the compiler has do to twice number of checks, it is not surprinsing that it grows so fast, especially considering that internally a compiler do complex operations that may require thousands if not more, assembly instructions and cache misses.

We have 2^14 operations in your case (twice 2^13). Wich assuming 2Ghz single core means 16.384 operations over 39 seconds wich means 420 operations/second and so 4.7 milions instructions/operation.

It seems that 4.7 millions instructions/operations is so much, but it is not necessarily. Increasing compilation time by few seconds is incredibly easy..

It would be interesting to force a scenario where both Clang and GCC can't use memoization and see wich one is faster, but I have no idea about how doing that, and anyway they should use memoization to save most work.

Note: I know I not investigated directly GCC code, that would probably be a heavy task also for GCC developers. I have minimum experience writing parsers and metacompilers so I partially faced most common problems that may arise during compilation (not GCC directly), other users are certainly more qualified than me to answer this question.

like image 74
CoffeDeveloper Avatar answered Nov 15 '22 20:11

CoffeDeveloper