
Identify slow-to-compile function

I have some .cpp files that take a long time to compile. They contain some basic classes/code with a few templates, but nothing that would justify compile times on the order of dozens of seconds.

I do use a couple of external libraries (Boost/OpenCV).

This is what gcc says about the compilation time. How can I find the library/include/function call that's to blame for the horrendous compilation time?

Execution times (seconds)
 phase setup             :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    1445 kB ( 0%) ggc
 phase parsing           :   6.69 (46%) usr   1.61 (60%) sys  12.14 (47%) wall  488430 kB (66%) ggc
 phase lang. deferred    :   1.59 (11%) usr   0.36 (13%) sys   3.83 (15%) wall   92964 kB (13%) ggc
 phase opt and generate  :   6.25 (43%) usr   0.72 (27%) sys  10.09 (39%) wall  152799 kB (21%) ggc
 |name lookup            :   1.05 ( 7%) usr   0.28 (10%) sys   2.01 ( 8%) wall   52063 kB ( 7%) ggc
 |overload resolution    :   0.83 ( 6%) usr   0.18 ( 7%) sys   1.48 ( 6%) wall   42377 kB ( 6%) ggc
...

Profiling the C++ compilation process deals with identifying the slow file, but I need more fine-grained information to find the culprit.

(Other files/projects compile in milliseconds/seconds, so it's not a matter of computer resources. I use gcc 4.9.1)

asked Mar 07 '15 by Sam



2 Answers

There are basically two things that cause long compilation times: too many includes and too many templates.

When you include too many headers, and those headers in turn include too many headers of their own, the compiler has a lot of work to do just to load all of these files, and it will spend an inordinate amount of time on the processing passes it has to run over all that code, whether it is actually used or not: preprocessing, lexical analysis, AST building, etc. This can be especially problematic when code is spread over a large number of small headers, because performance becomes very much I/O bound (lots of time wasted just fetching and reading files from the hard-disk). Unfortunately, Boost libraries tend to be structured very much in this way.

Here are a few ways and tools to tackle this problem:

  • You can use the "include-what-you-use" tool. This is a Clang-based analysis tool that looks at what you actually use in your code and which headers those things come from, and then reports any potential optimizations you could make by removing unnecessary includes, using forward declarations instead, or replacing broad "all-in-one" headers with more fine-grained ones (see the sketch after this list).
  • Most compilers have options to dump the preprocessed source (on GCC / Clang, it's the -E or -E -P options, or you can simply use GCC's C preprocessor program cpp directly). You can take your source file, comment out different include statements or groups of include statements, and dump the preprocessed source to see the total amount of code that these different headers pull in (and maybe use a line-count command, like $ g++ -E -P my_source.cpp | wc -l). This can help you identify, in sheer number of lines of code to process, which headers are the worst offenders. Then, you can see what you can do to avoid them or mitigate the issue somehow.
  • You can also use pre-compiled headers. This is a feature supported by most compilers with which you can specify certain headers (especially oft-included "all-in-one" headers) to be pre-compiled to avoid re-parsing them for every source file that includes them.
  • If your OS supports it, you can use a ram-disk for your code and the headers of your external libraries. This essentially takes part of your RAM and makes it look like a normal disk / file-system. This can significantly reduce compilation times by reducing the I/O latency, since all the headers and source files are read from RAM instead of the actual hard-disk.
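
For example, a header that only stores a pointer or reference to a heavy type usually doesn't need the type's full definition: a forward declaration is enough, and the heavy include moves into the one cpp file that needs it. Here is a minimal sketch of the kind of change include-what-you-use tends to suggest (Widget, Renderer and the file names are made-up examples, not from the question's code):

    // renderer.h -- hypothetical header: a forward declaration of Widget is enough,
    // so widget.h (and any heavy headers it drags in) stays out of this header.
    #ifndef RENDERER_H
    #define RENDERER_H

    class Widget;              // forward declaration, fine for pointers/references

    class Renderer {
    public:
        explicit Renderer(Widget& w);
        void draw() const;
    private:
        Widget* widget_;       // storing a pointer does not require the full definition
    };

    #endif // RENDERER_H

Only renderer.cpp then needs to #include widget.h (and whatever Boost/OpenCV headers it uses), so that cost is paid in one translation unit instead of in every file that includes renderer.h.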

The second problem is that of template instantiations. In your time report from GCC, there should be a time value reported somewhere for the template instantiation phase. If that number is high, which it will be as soon as there is any significant amount of template meta-programming involved in the code, then you will need to work on that problem. There are lots of reasons why some template-heavy code can be painfully slow to compile, including deeply recursive instantiation patterns, overly fancy SFINAE tricks, abuse of type-traits and concept checking, and good old-fashioned over-engineered generic code. But there are also simple tricks that can fix a lot of issues, like using unnamed namespaces (to avoid all the time wasted generating symbols for instantiations that don't really need to be visible outside the translation unit) and specializing type-traits or concept-check templates (to basically "short-circuit" much of the fancy meta-programming that goes into them). Another potential solution is to use "extern templates" (from C++11) to control where specific templates are instantiated (e.g., in a separate cpp file) and avoid re-instantiating them in every translation unit that uses them.
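
As a rough illustration of the extern-template approach (the Matrix class and the file names below are made up for the example, not taken from the question's code): the header declares the instantiation as extern, and exactly one cpp file provides the explicit instantiation definition.

    // matrix.h -- hypothetical class template used throughout the project
    #ifndef MATRIX_H
    #define MATRIX_H

    #include <vector>

    template <typename T>
    class Matrix {
    public:
        Matrix(int rows, int cols) : rows_(rows), cols_(cols), data_(rows * cols) {}
        T& at(int r, int c) { return data_[r * cols_ + c]; }
    private:
        int rows_;
        int cols_;
        std::vector<T> data_;
    };

    // C++11: tell every translation unit that includes this header NOT to
    // instantiate Matrix<double> itself; the instantiation lives elsewhere.
    extern template class Matrix<double>;

    #endif // MATRIX_H

    // matrix.cpp -- the ONE file that actually instantiates Matrix<double>
    #include "matrix.h"

    template class Matrix<double>;   // explicit instantiation definition

Every other file that uses Matrix<double> then just references that single instantiation at link time instead of re-instantiating it.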

Here are a couple of ways or tools to help you identify the bottlenecks:

  • You can use the "Templight" profiling tool (and its auxiliary "Templight-tools" for dealing with the traces). This is again a Clang-based tool that can be used as a drop-in replacement for the Clang compiler (the tool is actually an instrumented full-blown compiler), and it will generate a complete profile of all the template instantiations that occur during compilation, including the time spent on each (and, optionally, memory consumption estimates, although these will affect the timing values). The traces can later be converted to a Callgrind format and visualized in KCacheGrind; see the description of that on the templight-tools page. This can basically be used like a typical run-time profiler, but for profiling the time and memory consumption of compiling template-heavy code.
  • A more rudimentary way of finding the worst offenders is to create test source files that each instantiate a particular template you suspect is responsible for the long compilation times. Then, you compile those files, time them, and work your way (maybe in a "binary search" fashion) towards the worst offenders (see the sketch after this list).
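
For that last approach, the probe file can be as small as this (matrix.h and Matrix are the same hypothetical names from the earlier sketch; substitute whatever template you actually suspect), compiled with g++ -c -ftime-report so you get a per-suspect timing to compare:

    // probe_matrix.cpp -- hypothetical probe: instantiate ONE suspect template
    // and nothing else, then compare compile times between different probes.
    // Build with:  g++ -c -ftime-report probe_matrix.cpp
    #include "matrix.h"

    template class Matrix<float>;   // force the instantiation explicitly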

But even with these tricks, identifying template instantiation bottlenecks is easier than actually solving them. So, good luck with that.

answered Oct 10 '22 by Mikael Persson


This can't be fully answered without information about how your source files are organised and built, so just some general observations.

  1. Template instantiations can increase compile times a lot, particularly if complicated templates are instantiated for several different types/parameters in each of multiple source files. Schemes for explicit template instantiation (i.e. making sure the templates are only instantiated in a few source files rather than all of them) can reduce compilation times in such circumstances (as well as link time, and executable file size). You need to read compiler documentation for how to do this - it does not necessarily occur by default and can mean restructuring your code to support it.
  2. Header files that are #included in many source files, whether needed or not, tend to increase compilation times. I saw one case where a team member wrote a "globals.h" that #included everything, and #included that everywhere - and the build times (in a large project) increased by an order of magnitude. It's a double whammy - the compilation time of each source file is increased, and that is multiplied by the number of source files that directly or indirectly #include that header. If turning on features like "precompiled headers" speeds up the second and subsequent builds, this is probably a contributor. (You might view precompiled headers as a solution to this, but bear in mind there are other trade-offs with using them; a sketch of GCC's mechanism follows this list.)
  3. If you are using external libs, check to make sure they are installed and configured locally. A compilation process that silently goes looking on the internet for some component (e.g. a hard-coded header file name that is on some remote server) will slow things considerably. You'd be surprised how often that happens with third-party libraries.
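
For point 2, GCC's precompiled-header mechanism is one mitigation; the header name pch.h below is just an example. GCC produces a .gch file when you compile a header directly, and uses it automatically when that header is the first #include and the compile flags match:

    // pch.h -- hypothetical precompiled header: only big, rarely-changing headers here.
    // Build it once with the same flags as the rest of the project, e.g.:
    //   g++ -std=c++11 -O2 pch.h     (produces pch.h.gch next to it)
    // Then make pch.h the first #include of every source file; GCC picks up
    // pch.h.gch automatically as long as the compilation flags match.
    #include <vector>
    #include <string>
    #include <map>
    #include <boost/lexical_cast.hpp>
    #include <opencv2/core/core.hpp>

The trade-offs mentioned above still apply: touching pch.h forces a full rebuild, and every source file silently depends on everything in it.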

Beyond that, techniques to find the problem depend on how your build process is structured.

If you're using a makefile (or some other means) that compiles source files separately, then use some way to time the individual compilation and linking commands. Bear in mind that it may be the link time that dominates.

If you're using a single compilation command (e.g. gcc invoked on multiple files in one command) then break it up into individual commands for each source file.

Once you've isolated which source file (if any) is the offender, selectively eliminate sections from it to find which code within it is the problem. As Yakk said in a comment, use a "binary search" for this to eliminate functions within the file. I'd suggest removing whole functions first (to narrow down to the offending function) and then using the same technique within the offending function.
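
One cheap way to do that bisection without actually deleting code is to wrap half of the file in #if 0 / #endif and re-time the compile; the file and function names below are placeholders, not from the question's code:

    // suspect_file.cpp -- hypothetical file being bisected for compile-time cost
    #include "heavy_stuff.h"   // whatever the file already includes

    void cheap_function() { /* ... */ }

    #if 0   // temporarily compile out the second half: if the build is suddenly
            // fast, the offender is below this line, otherwise it is above;
            // then repeat on the guilty half.
    void suspected_slow_function() {
        // heavy template instantiations, Boost usage, etc.
    }
    #endif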

It does help to structure your code so the number of functions per file is reasonably small. That reduces the need to rebuild large files for a minor change to one function, and helps isolate such problems more easily in the future.

answered Oct 10 '22 by Rob