How to make C++ linking take less memory

I am working on a university research project in C++ that uses a lot of templates, which in turn nest further templates. The project is about efficient index data structures for a specific area of research. As you can imagine, an index structure has a lot of parameters to adjust, so we use template parameters extensively. Of course, we want to test our indexes with different sets of parameters, so there are quite a lot of template instantiations.

The project is not that huge, maybe 50k LOC. Still, linking takes 50 seconds and consumes over 7 GB of memory (!!!). I am on a 32 GB workstation, so everything is fine for me. However, I often have bachelor's and master's students working on this project, and they often work on laptops with 4 or 8 GB of RAM, so they have great trouble building the project. The resulting test binary (i.e. a binary that simply contains unit tests for the index structures) is 700 megabytes. Most of that is symbols, because the nested templates produce huge names. If I run strip on the binary, it drops to 8 megabytes.

So is there a way to reduce the RAM usage during linkage? And is there a way to have smaller symbols even with nested templates?

We compile using g++ 4.9 with -std=c++11 under Ubuntu 14.10.

Edit:

It really seems to be the nested templates. We have two test cases with really deeply nested templates. The two .o files for these tests make up almost 90% of the final binary. They result in method names that are over 3000 characters long. There is no way to avoid the nested templates here, as they form a "processing tree" of an example query. Is there any way to keep names short when using deeply nested templates?

asked Sep 25 '14 10:09

gexicide

3 Answers

So is there a way to reduce the RAM usage during linkage? And is there a way to have smaller symbols even with nested templates?

Have you considered using the pimpl idiom in client code?

Consider a situation where you have this include chain:

A.h -> B.h -> C.h -> D.h (C includes D, B includes C, etc)

Suppose that A.h defines class AA, B.h defines class BB and so on (with AA implemented in terms of BB, BB implemented in terms of CC and so on).

If DD is a large template and used in the implementation of CC, the templated code will be compiled three times, for the compilation units A, B and C.

Now, consider what happens if instead of C.h including D.h, you have the following situation:

C.h forward-declares CCImpl and stores a CCImpl *pimpl in CC; every CC method forwards to the corresponding pimpl-> method (and C.h does not include D.h).

C.cpp includes C.h and D.h, and implements CCImpl and CC.

Now, D.h will be included once (and compiled once, for C.cpp). A.h and B.h will only include C.h, with its CCImpl forward declaration. A.h, B.h and C.h no longer know that a template exists.
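As a rough illustration of the layout described above, here is a minimal single-file sketch with comments marking which part would live in which file. The members add and sum, the contents of DD, and the use of std::unique_ptr instead of a raw CCImpl * are assumptions made only for this example; they are not taken from the project in the question.

```cpp
#include <memory>
#include <vector>

// ---- would live in C.h: no template is visible to includers ----
class CC {
public:
    CC();
    ~CC();                 // defined in C.cpp, where CCImpl is a complete type
    void add(int value);   // hypothetical method, forwards to the impl
    int sum() const;       // hypothetical method, forwards to the impl
private:
    struct CCImpl;                   // forward declaration only
    std::unique_ptr<CCImpl> pimpl;   // stand-in for "CCImpl *pimpl"
};

// ---- would live in D.h: the large template ----
template <typename T, typename Container = std::vector<T>>
class DD {
public:
    void push(const T& v) { data.push_back(v); }
    T total() const {
        T t{};
        for (const T& v : data) t += v;
        return t;
    }
private:
    Container data;
};

// ---- would live in C.cpp: the only place that instantiates DD ----
struct CC::CCImpl {
    DD<int> index;
};

CC::CC() : pimpl(new CCImpl) {}   // C++11: std::make_unique is not available
CC::~CC() = default;
void CC::add(int value) { pimpl->index.push(value); }
int CC::sum() const { return pimpl->index.total(); }
```

Code that includes only the C.h part sees CC and an incomplete CCImpl, so it never instantiates DD. The destructor is declared in the header and defined next to CCImpl because std::unique_ptr needs the complete type at the point where it deletes the object.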

answered Sep 22 '22 13:09

utnapistim

GCC has a garbage collection scheme for the RAM it uses.

The parameters ggc-min-expand and ggc-min-heapsize determine when GCC should clean up and deallocate its unused memory (their defaults are derived from the total system memory).

You could try something like:

g++ --param ggc-min-expand=0 --param ggc-min-heapsize=8192

From the GCC manual:

ggc-min-expand

GCC uses a garbage collector to manage its own memory allocation. This parameter specifies the minimum percentage by which the garbage collector's heap should be allowed to expand between collections. Tuning this may improve compilation speed; it has no effect on code generation.

The default is 30% + 70% * (RAM/1GB) with an upper bound of 100% when RAM >= 1GB. If getrlimit is available, the notion of "RAM" is the smallest of actual RAM, RLIMIT_RSS, RLIMIT_DATA and RLIMIT_AS. If GCC is not able to calculate RAM on a particular platform, the lower bound of 30% is used. Setting this parameter and ggc-min-heapsize to zero causes a full collection to occur at every opportunity. This is extremely slow, but can be useful for debugging.

ggc-min-heapsize

Minimum size of the garbage collector's heap before it begins bothering to collect garbage. The first collection occurs after the heap expands by ggc-min-expand% beyond ggc-min-heapsize. Again, tuning this may improve compilation speed, and has no effect on code generation. The default is RAM/8, with a lower bound of 4096 (four megabytes) and an upper bound of 131072 (128 megabytes). If getrlimit is available, the notion of "RAM" is the smallest of actual RAM, RLIMIT_RSS, RLIMIT_DATA and RLIMIT_AS. If GCC is not able to calculate RAM on a particular platform, the lower bound is used. Setting this parameter very large effectively disables garbage collection. Setting this parameter and ggc-min-expand to zero causes a full collection to occur at every opportunity.
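To get a feeling for what the defaults quoted above mean on a student laptop, the two formulas can be worked through in a few lines of C++. This is only a sketch of the arithmetic as the manual text describes it; the function names are made up for this example, the getrlimit adjustments are ignored, and this is not GCC's actual implementation.

```cpp
#include <algorithm>
#include <cstdint>

// Default ggc-min-expand in percent, per the manual text:
// 30% + 70% * (RAM / 1GB), capped at 100%.
// Hypothetical helper, not part of GCC.
int default_ggc_min_expand(std::uint64_t ram_bytes) {
    const double one_gb = 1024.0 * 1024.0 * 1024.0;
    double percent = 30.0 + 70.0 * (static_cast<double>(ram_bytes) / one_gb);
    return static_cast<int>(std::min(percent, 100.0));
}

// Default ggc-min-heapsize in kilobytes, per the manual text:
// RAM / 8, clamped to the range [4096, 131072].
// Hypothetical helper, not part of GCC.
std::uint64_t default_ggc_min_heapsize_kb(std::uint64_t ram_bytes) {
    std::uint64_t kb = ram_bytes / 8 / 1024;
    return std::max<std::uint64_t>(4096, std::min<std::uint64_t>(kb, 131072));
}
```

Under these formulas, any machine with 1 GB of RAM or more, including a 4 GB or 8 GB laptop, ends up with the upper bounds: 100% for ggc-min-expand and 131072 KB for ggc-min-heapsize. The values in the command above (0 and 8192) are therefore far more aggressive than the defaults.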

Further details:

http://hostingfu.com/article/compiling-with-gcc-on-low-memory-vps

answered Sep 21 '22 13:09

manlio

Use inheritance judiciously. You may have a class Foo<1,4,8,1,9,int, std::string> as a base class for class Bar, and then the object file will mention just Bar.

Note that typedef does not introduce names for linking purposes.

[edit] To address a performance concern from another comment: an empty derived class adds no overhead on common compilers at normal optimization levels (and often there's no overhead even in debug builds).
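A small sketch of the idea, assuming a made-up Foo with the parameter list from the example above and a hypothetical outer template Node that stands in for the next level of nesting. It compares the length of typeid(...).name() for the two spellings; on GCC that string is the mangled name, but its exact content is implementation-defined.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <typeinfo>

// Hypothetical heavily parameterized template.
template <int A, int B, int C, int D, int E, typename K, typename V>
struct Foo {
    K key{};
    V value{};
    int weight() const { return A + B + C + D + E; }
};

// A typedef is only an alias: names built from it still spell out Foo<...>.
typedef Foo<1, 4, 8, 1, 9, int, std::string> FooAlias;

// An empty derived class is a distinct type with a short name of its own.
struct Bar : Foo<1, 4, 8, 1, 9, int, std::string> {};

// Hypothetical outer template, standing in for the next nesting level.
template <typename T>
struct Node {};

// Length of the implementation-defined type name of Node<T>.
template <typename T>
std::size_t type_name_length() {
    return std::strlen(typeid(Node<T>).name());
}
```

Comparing type_name_length<Bar>() with type_name_length<FooAlias>() shows how much shorter the nested name becomes. Members inherited from Foo still carry the full Foo<...> name in their own symbols, so the saving comes from outer templates that take Bar as an argument, and it compounds with every additional level of nesting.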

answered Sep 20 '22 13:09

MSalters

Tags: c++, c++11, templates, linker