
How does the linker handle identical template instantiations across translation units?

Suppose I have two translation-units:

foo.cpp

#include <vector>

void foo() {
  auto v = std::vector<int>();
}

bar.cpp

#include <vector>

void bar() {
  auto v = std::vector<int>();
}

When I compile these translation-units, each will instantiate std::vector<int>.

My question is: how does this work at the linking stage?

  • Do both instantiations have different mangled names?
  • Does the linker remove them as duplicates?
asked Jun 02 '17 18:06 by sdgfsdh


2 Answers

C++ requires that an inline function definition be present in a translation unit that references the function. Template member functions are implicitly inline, but also by default are instantiated with external linkage. Hence the duplication of definitions that will be visible to the linker when the same template is instantiated with the same template arguments in different translation units. How the linker copes with this duplication is your question.

Your C++ compiler is subject to the C++ Standard, but your linker is not subject to any codified standard as to how it shall link C++: it is a law unto itself, rooted in computing history and indifferent to the source language of the object code it links. Your compiler has to work with what a target linker can and will do so that you can successfully link your programs and see them do what you expect. So I'll show you how the GCC C++ compiler interworks with the GNU linker to handle identical template instantiations in different translation units.

This demonstration exploits the fact that while the C++ Standard requires - by the One Definition Rule - that the instantiations in different translation units of the same template with the same template arguments shall have the same definition, the compiler - of course - cannot enforce any requirement like that on relationships between different translation units. It has to trust us.

So we'll instantiate the same template with the same parameters in different translation units, but we'll cheat by injecting a macro-controlled difference into the implementations in different translation units that will subsequently show us which definition the linker picks.

If you suspect this cheat invalidates the demonstration, remember: the compiler cannot know whether the ODR is ever honoured across different translation units, so it cannot behave differently on that account, and there's no such thing as "cheating" the linker. Anyhow, the demo will demonstrate that it is valid.

First we have our cheat template header:

thing.hpp

#ifndef THING_HPP
#define THING_HPP

#ifndef ID
#error ID undefined
#endif

template<typename T>
struct thing {
    T id() const {
        return T{ID};
    }
};

#endif

The value of the macro ID is the tracer value we can inject.

Next a source file:

foo.cpp

#define ID 0xf00
#include "thing.hpp"

unsigned foo() {
    thing<unsigned> t;
    return t.id();
}

It defines function foo, in which thing<unsigned> is instantiated to define t, and t.id() is returned. By being a function with external linkage that instantiates thing<unsigned>, foo serves the purposes of:

  • obliging the compiler to do that instantiating at all
  • exposing the instantiation in linkage so we can then probe what the linker does with it.

Another source file:

boo.cpp

#define ID 0xb00
#include "thing.hpp"

unsigned boo() {
    thing<unsigned> t;
    return t.id();
}

which is just like foo.cpp except that it defines boo in place of foo and sets ID = 0xb00.

And lastly a program source:

main.cpp

#include <iostream>

extern unsigned foo();
extern unsigned boo();

int main() {
    std::cout << std::hex
    << '\n' << foo()
    << '\n' << boo()
    << std::endl;
    return 0;
}

This program will print, as hex, the return value of foo() - which our cheat should make = f00 - then the return value of boo() - which our cheat should make = b00.

Now we'll compile foo.cpp, and we'll do it with -save-temps because we want a look at the assembly:

g++ -c -save-temps foo.cpp 

This writes the assembly in foo.s and the portion of interest there is the definition of thing<unsigned int>::id() const (mangled = _ZNK5thingIjE2idEv):

    .section    .text._ZNK5thingIjE2idEv,"axG",@progbits,_ZNK5thingIjE2idEv,comdat
    .align 2
    .weak   _ZNK5thingIjE2idEv
    .type   _ZNK5thingIjE2idEv, @function
_ZNK5thingIjE2idEv:
.LFB2:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    movq    %rdi, -8(%rbp)
    movl    $3840, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc

Three of the directives at the top are significant:

.section    .text._ZNK5thingIjE2idEv,"axG",@progbits,_ZNK5thingIjE2idEv,comdat 

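This one puts the function definition in a linkage section of its own called .text._ZNK5thingIjE2idEv that will be output, if it's needed, merged into the .text (i.e. code) section of the program in which the object file is linked. A linkage section like that, i.e. .text.<function_name>, is called a function-section. It's a code section that contains only the definition of function <function_name>. The "axG" flags and the trailing comdat keyword also make the section a member of a COMDAT group named after the symbol, which is what entitles the linker to keep one copy of the section and discard identically named duplicates from other object files.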

The directive:

.weak   _ZNK5thingIjE2idEv 

is crucial. It classifies thing<unsigned int>::id() const as a weak symbol. The GNU linker recognises strong symbols and weak symbols. For a strong symbol, the linker will accept only one definition in the linkage. If there are more, it will give a multiple-definition error. But for a weak symbol, it will tolerate any number of definitions, and pick one. If a weakly defined symbol also has (just one) strong definition in the linkage then the strong definition will be picked. If a symbol has multiple weak definitions and no strong definition, then the linker can pick any one of the weak definitions, arbitrarily.

The directive:

.type   _ZNK5thingIjE2idEv, @function 

classifies thing<unsigned int>::id() as referring to a function - not data.

Then in the body of the definition, the code is assembled at the address labelled by the weak global symbol _ZNK5thingIjE2idEv, the same one locally labelled .LFB2. The code returns 3840 ( = 0xf00).

Next we'll compile boo.cpp the same way:

g++ -c -save-temps boo.cpp 

and look again at how thing<unsigned int>::id() const is defined in boo.s:

    .section    .text._ZNK5thingIjE2idEv,"axG",@progbits,_ZNK5thingIjE2idEv,comdat
    .align 2
    .weak   _ZNK5thingIjE2idEv
    .type   _ZNK5thingIjE2idEv, @function
_ZNK5thingIjE2idEv:
.LFB2:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    movq    %rdi, -8(%rbp)
    movl    $2816, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc

It's identical, except for our cheat: this definition returns 2816 ( = 0xb00).

While we're here, let's note something that might or might not go without saying: once we're in assembly (or object code), classes have evaporated. Here, we're down to:

  • data
  • code
  • symbols, which can label data or label code.

So nothing here specifically represents the instantiation of thing<T> for T = unsigned. All that's left of thing<unsigned> in this instance is the definition of _ZNK5thingIjE2idEv a.k.a thing<unsigned int>::id() const.
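The mapping between the mangled symbol and the C++ name can be checked from code as well as with a tool such as c++filt. The following is a minimal sketch, assuming GCC or Clang with the Itanium C++ ABI header <cxxabi.h>; the helper name demangle is made up for illustration and is not part of the original demonstration.

```cpp
#include <cassert>
#include <cstdlib>    // std::free
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <string>

// Hypothetical helper: returns the demangled form of an Itanium-ABI symbol,
// or the input unchanged if the runtime cannot demangle it.
std::string demangle(const char* mangled) {
    assert(mangled != nullptr);
    int status = 0;
    // With null buffer arguments, __cxa_demangle allocates the result with malloc.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return mangled;
    }
    std::string result(text);
    std::free(text);  // the caller owns the malloc'd buffer
    return result;
}
```

Calling demangle("_ZNK5thingIjE2idEv") should give back thing<unsigned int>::id() const, the only trace of thing<unsigned> that survives in the object code.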

So now we know what the compiler does about instantiating thing<unsigned> in a given translation unit. If it is obliged to instantiate a thing<unsigned> member function, then it assembles the definition of the instantiated member function at a weak global symbol that identifies the member function, and it puts this definition into its own function-section.

Now let's see what the linker does.

First we'll compile the main source file.

g++ -c main.cpp 

Then link all the object files, requesting a diagnostic trace on _ZNK5thingIjE2idEv, and a linkage map file:

g++ -o prog main.o foo.o boo.o -Wl,--trace-symbol='_ZNK5thingIjE2idEv',-M=prog.map
foo.o: definition of _ZNK5thingIjE2idEv
boo.o: reference to _ZNK5thingIjE2idEv

So the linker tells us that the program gets the definition of _ZNK5thingIjE2idEv from foo.o and calls it in boo.o.

Running the program shows it's telling the truth:

./prog

f00
f00

Both foo() and boo() are returning the value of thing<unsigned>().id() as instantiated in foo.cpp.

What has become of the other definition of thing<unsigned int>::id() const in boo.o? The map file shows us:

prog.map

...
Discarded input sections
 ...
 ...
 .text._ZNK5thingIjE2idEv
                0x0000000000000000        0xf boo.o
 ...
 ...

The linker chucked away the function-section in boo.o that contained the other definition.

Let's now link prog again, but this time with foo.o and boo.o in the reverse order:

$ g++ -o prog main.o boo.o foo.o -Wl,--trace-symbol='_ZNK5thingIjE2idEv',-M=prog.map
boo.o: definition of _ZNK5thingIjE2idEv
foo.o: reference to _ZNK5thingIjE2idEv

This time, the program gets the definition of _ZNK5thingIjE2idEv from boo.o and calls it in foo.o. The program confirms that:

$ ./prog

b00
b00

And the map file shows:

...
Discarded input sections
 ...
 ...
 .text._ZNK5thingIjE2idEv
                0x0000000000000000        0xf foo.o
 ...
 ...

that the linker chucked away the function-section .text._ZNK5thingIjE2idEv from foo.o.

That completes the picture.

The compiler emits, in each translation unit, a weak definition of each instantiated template member in its own function section. The linker then just picks the first of those weak definitions that it encounters in the linkage sequence when it needs to resolve a reference to the weak symbol. Because each of the weak symbols addresses a definition, any one of them - in particular, the first one - can be used to resolve all references to the symbol in the linkage, and the rest of the weak definitions are expendable. The surplus weak definitions must be ignored, because the linker can only link one definition of a given symbol. And the surplus weak definitions can be discarded by the linker, with no collateral damage to the program, because the compiler placed each one in a linkage section all by itself.
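The selection rule described above can be written down as a toy model. This is only a sketch of the rule as stated in this answer, not of how the GNU linker is implemented; every name in it (Binding, Definition, LinkResult, resolve_symbols) is hypothetical, and it ignores sections, references and archives entirely.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Toy model only. These names are hypothetical and do not mirror ld's internals.
enum class Binding { Weak, Strong };

struct Definition {
    std::string object_file;  // e.g. "foo.o"
    std::string symbol;       // e.g. "_ZNK5thingIjE2idEv"
    Binding binding;
};

struct LinkResult {
    std::map<std::string, Definition> chosen;  // symbol -> definition used for all references
    std::vector<std::string> ignored;          // object files whose definition was passed over
};

// Walks the definitions in link order, as they would appear on the command line.
LinkResult resolve_symbols(const std::vector<Definition>& inputs) {
    LinkResult result;
    for (const Definition& def : inputs) {
        assert(!def.symbol.empty());
        auto it = result.chosen.find(def.symbol);
        if (it == result.chosen.end()) {
            result.chosen.emplace(def.symbol, def);            // first definition seen wins so far
        } else if (def.binding == Binding::Weak) {
            result.ignored.push_back(def.object_file);         // surplus weak definition
        } else if (it->second.binding == Binding::Weak) {
            result.ignored.push_back(it->second.object_file);  // earlier weak one is passed over
            it->second = def;                                  // a strong definition displaces it
        } else {
            throw std::runtime_error("multiple definition of " + def.symbol);
        }
    }
    return result;
}
```

Feeding the model two weak definitions of _ZNK5thingIjE2idEv in the order foo.o, boo.o selects foo.o and ignores boo.o; reversing the order reverses the outcome, which matches the two link runs shown earlier.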

By picking the first weak definition it sees, the linker is effectively picking at random, because the order in which object files are linked is arbitrary. But this is fine, as long as we obey the ODR across multiple translation units, because if we do, then all of the weak definitions are indeed identical. The usual practice of #include-ing a class template everywhere from a header file (and not macro-injecting any local edits when we do so) is a fairly robust way of obeying the rule.
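For contrast, here is one way to get per-translation-unit behaviour without breaking the ODR: make the varying value a template argument rather than a macro. This is a sketch under that assumption, not part of the demonstration above; the two-parameter thing<T, Id> and the check_tracers function are hypothetical, and foo and boo stand in for the functions that lived in foo.cpp and boo.cpp.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical variant of thing<T>: the tracer value is a template argument,
// so every translation unit sees the same definition of the template.
template <typename T, T Id>
struct thing {
    T id() const { return Id; }
};

// Stand-ins for the functions defined in foo.cpp and boo.cpp.
unsigned foo() { return thing<unsigned, 0xf00>{}.id(); }
unsigned boo() { return thing<unsigned, 0xb00>{}.id(); }

// Different template arguments give different specializations, and therefore
// different member-function symbols for the linker to keep apart.
static_assert(!std::is_same<thing<unsigned, 0xf00>, thing<unsigned, 0xb00>>::value,
              "the two specializations are distinct types");

// Small self-check of the values each function returns.
void check_tracers() {
    assert(foo() == 0xf00u);
    assert(boo() == 0xb00u);
}
```

Because the two instantiations are different specializations, there is no duplicate weak definition for the linker to choose between, and each function keeps its own value whatever the link order.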

answered Oct 05 '22 23:10 by Mike Kinghan


Different implementations use different strategies for this.

The GNU compiler, for example, marks template instantiations as weak symbols. Then at link time, the linker can throw away all definitions but one of the same weak symbol.

The Sun Solaris compiler, on the other hand, does not instantiate templates at all during normal compilation. Then at link time, the linker collects all template instantiations needed to complete the program, and then goes ahead and calls the compiler in a special template-instantiation mode. Thus exactly one instantiation is produced for each template. There are no duplicates to merge or get rid of.

Each approach has its own advantages and disadvantages.

answered Oct 05 '22 21:10 by n. 1.8e9-where's-my-share m.