I'm not entirely clear on how the new extern template feature is meant to work in C++11. I understand that it is intended to help speed up compilation and simplify linking issues with shared libraries. Does that mean that the compiler does not even parse the function body, forcing a non-inlined call to be made? Or does it simply instruct the compiler not to generate an actual function body when a non-inlined call is made? Link-time code generation notwithstanding, obviously.
As a concrete example of where the difference might matter, consider a function that operates on an incomplete type.
//Common header
template<typename T>
void DeleteMe(T* t) {
delete t;
}
struct Incomplete;
extern template void DeleteMe(Incomplete*);
//Implementation file 1
#include common_header
struct Incomplete { };
template void DeleteMe(Incomplete*);
//Implementation file 2
#include common_header
int main() {
Incomplete* p = factory_function_not_shown();
DeleteMe(p);
}
Within "Implementation file 2", it is unsafe to delete a pointer to Incomplete. So an inlined version of DeleteMe would fail. But if it is left as an actual function call, and the function itself is generated within "Implementation file 1", everything will work correctly.
As a corollary, are the rules the same for member functions of templated classes with a similar extern template class declaration?
As an experiment, MSVC produces the correct output for the above code, but if the extern line is removed it generates a warning about deleting an incomplete type. However, this is a remnant of a non-standard extension they introduced years ago, so I'm not sure how much I can trust this behavior. I don't have access to any other build environments to experiment on (save ideone et al., but being limited to one translation unit is rather limiting in this case).
The idea behind extern templates is to make explicit template instantiations more useful.
As you know, in C++03, you can explicitly instantiate a template using this syntax:
template class SomeTemplateClass<int>;
template void foo<bool>();
This tells the compiler to instantiate the template in the current translation unit. However, this doesn't stop implicit instantiations from happening: the compiler still has to perform all implicit instantiations and then merge them together again during linking.
Example:
// a.h
template <typename> void foo() { /* ... */ }
// a.cpp
#include "a.h"
template void foo<int>();
// b.cpp
#include "a.h"
int main()
{
foo<int>();
return 0;
}
Here, a.cpp explicitly instantiates foo<int>(), but when we compile b.cpp, the compiler will instantiate it again because b.cpp has no idea that a.cpp is going to instantiate it anyway. For large functions with many different translation units doing implicit instantiations, this can add quite significantly to compile and link time. It may also cause the function to be unnecessarily inlined, which can lead to significant code bloat.
With extern templates, you can let other source files know that you plan to instantiate the template explicitly:
// a.h
template <typename> void foo() { /* ... */ }
extern template void foo<int>();
This way, b.cpp won't cause an instantiation of foo<int>(). The function will be instantiated in a.cpp and will be linked like any normal function. It's also much less likely to be inlined.
Note that this doesn't prevent inlining -- the function could still be inlined at link time in exactly the same way that a normal non-inline function can still be inlined.
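To make the pattern concrete, here is a minimal single-file sketch. The names twice, use_twice and check_twice are hypothetical and are not part of the example above. In a real project the extern template line would sit in the header and the explicit instantiation definition in exactly one .cpp file; they share a file here only so the sketch compiles on its own. When both appear in one translation unit, the definition has to come after the declaration.

```cpp
#include <cassert>

// Hypothetical function template standing in for foo from a.h.
template <typename T>
T twice(T value) { return value + value; }

// Explicit instantiation declaration (C++11): asks the compiler not to
// implicitly instantiate twice<int> in this translation unit.
// In a real project this line would live in the header.
extern template int twice<int>(int);

// Ordinary use of the specialization; the definition is expected to be
// supplied by an explicit instantiation definition somewhere in the program.
int use_twice() { return twice<int>(21); }

// Explicit instantiation definition: normally placed in exactly one .cpp file.
// It must follow the extern declaration when both are in the same file.
template int twice<int>(int);

// Small self-check of the sketch.
void check_twice() { assert(use_twice() == 42); }
```

If the last explicit instantiation definition is removed and nothing else in the program provides it, the link may fail with an undefined reference to twice<int>, unless the optimizer happened to inline the call.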
EDIT: For those who are curious, I just did a quick test to see how much time g++ spends instantiating templates. I tried instantiating std::sort<int*> in a varying number of translation units, with and without the instantiation being suppressed. The result was conclusive: 30ms per instantiation of std::sort. There's definitely time to be saved here in a large project.
Here is an interesting example:
#include <algorithm>
#include <string>
extern template class std::basic_string<char>;
int foo(std::string s)
{
int res = s.length();
res += s.find("some substring");
return res;
}
When compiled with g++-7.2 at -O3, this produces a non-inlined call to string::find but an inlined call to string::length.
Without the extern template, everything is indeed inlined. Clang behaves the same way, and MSVC is almost unable to inline anything in either case.
So the answer is: it depends, and compilers may have special heuristics for this.
Using extern template class does not seem to prevent inlining. I will illustrate this with an example. It is a bit involved, but it is the simplest I can come up with.
In file a.h we define the template class CFoo:
#ifndef A_H
#define A_H
#include <iostream>
template <typename T> class CFoo{
public: CFoo(){
std::cout << "CFoo Constructor, edit 0" << std::endl;
}
};
extern template class CFoo<int>;
#endif
At the end of a.h we use extern template class CFoo<int> to indicate to any translation unit with #include "a.h" that it does not need to generate any code for CFoo. It's a promise we make that all things CFoo will link smoothly.
In file c.cpp we have,
#include "a.h"
void run(){
CFoo<int> cf;
}
Due to the extern template class 'promise' at the end of a.h, the translation unit of c.cpp does not 'need to' generate any code for class CFoo.
Finally we declare a main function in b.cpp,
void run();
int main(){
run();
return 0;
}
There is nothing fancy in b.cpp: we simply declare void run(), which will be linked to its implementation in the translation unit c.cpp at link time. For completeness, here is a makefile:
cflags = -std=c++11 -O1
b : b.o a.o c.o
	g++ ${cflags} b.o a.o c.o -o b
b.o : b.cpp
	g++ ${cflags} -c b.cpp -o b.o
c.o : c.cpp
	g++ ${cflags} -c c.cpp -o c.o
a.o : a.cpp a.h
	g++ ${cflags} -c a.cpp -o a.o
clean:
	rm -rf a.o b.o c.o b
Using this makefile compiles and links an executable b which outputs "CFoo Constructor, edit 0" when run. But note! In the example above we do not seem to have instantiated CFoo<int> anywhere: CFoo<int> is definitely not instantiated in translation unit b.cpp, as the header is not included there, and translation unit c.cpp was told that it didn't need to generate code for CFoo. So what's going on?
Make one change to the makefile: replace -O1 with -O0, then run make clean followed by make.
Now the link step results in an error (using gcc 4.8.4):
c.o: In function `run()':
c.cpp:(.text+0x10): undefined reference to `CFoo<int>::CFoo()'
This is the error we would expect if there had been no inlining in the first place. At least, this is the conclusion I have come to; further ideas are very welcome.
To get linking with -O0, we need to keep our promise and provide an explicit instantiation of CFoo<int>. We provide this in file a.cpp:
#include "a.h"
template class CFoo<int>;
We can now be guaranteed that CFoo<int> is instantiated in the translation unit of a.cpp, and our promise will be kept. As an aside, note that template class CFoo<int> in a.cpp is preceded by extern template class CFoo<int> via the inclusion of a.h, which is not problematic.
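The declaration-then-definition order for a class template can be sketched in a single file as follows. Box, read_box and check_box are hypothetical names used only for illustration; in the project above, the two marked lines would live in a.h and a.cpp respectively. As I read the note in the standard's [temp.explicit] section, an extern template declaration does not suppress implicit instantiation of inline functions, and member functions defined inside the class body are implicitly inline. That would be consistent with the constructor call being inlined at -O1 while no out-of-line copy is emitted at -O0.

```cpp
#include <cassert>

// Hypothetical class template standing in for CFoo.
template <typename T>
class Box {
public:
    // Defined in the class body, so these members are implicitly inline.
    explicit Box(T v) : value_(v) {}
    T get() const { return value_; }
private:
    T value_;
};

// Header part (the "promise"): no out-of-line code for Box<int> is
// expected to be generated by translation units that only see this line.
extern template class Box<int>;

// A user of the specialization, comparable to run() in c.cpp.
int read_box() {
    Box<int> b(7);
    return b.get();
}

// a.cpp part (keeping the promise): instantiates all members of Box<int>.
// It must come after the extern declaration when both are in one file.
template class Box<int>;

// Small self-check of the sketch.
void check_box() { assert(read_box() == 7); }
```

Keeping the explicit instantiation definition in exactly one source file means the program links at every optimization level, whether or not the compiler chooses to inline the members.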
Finally, I find this unpredictable, optimisation-dependent behaviour annoying, as it means that modifications to a.h followed by recompilation of a.cpp alone might not be reflected in run() as they would be if there were no inlining (try changing the output string of the CFoo constructor and running make again).