Does extern template prevent inlining of functions?

I'm not entirely clear on how the new extern template feature is meant to work in C++11. I understand that it is intended to speed up compilation and simplify linking issues with shared libraries. Does that mean the compiler does not even parse the function body, forcing a non-inlined call? Or does it simply instruct the compiler not to generate an out-of-line function body when a non-inlined call is made? Link-time code generation notwithstanding, obviously.

As a concrete example of where the difference might matter, consider a function that operates on an incomplete type.

//Common header
template<typename T>
void DeleteMe(T* t) {
    delete t;
}

struct Incomplete;
extern template void DeleteMe(Incomplete*);

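//Implementation file 1
#include "common_header.h"  // the common header shown above
struct Incomplete { };
template void DeleteMe(Incomplete*);

//Implementation file 2
#include "common_header.h"  // the common header shown above
Incomplete* factory_function_not_shown();  // defined elsewhere
int main() {
   Incomplete* p = factory_function_not_shown();
   DeleteMe(p);
}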

Within "Implementation file 2", it is unsafe to delete a pointer to Incomplete. So an inlined version of DeleteMe would fail. But if it is left as an actual function call, and the function itself were generated within "Implementation file 1", everything will work correctly.
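One way to make this hazard show up at compile time instead of silently is the "checked delete" idiom, which is not part of the code above. A minimal sketch, assuming hypothetical names CheckedDelete, Widget and destroyOne: the static_assert fails to compile in any translation unit that instantiates the body while T is still incomplete.

```cpp
// Sketch of the checked-delete idiom (all names here are hypothetical).
template <typename T>
void CheckedDelete(T* t) {
    // sizeof on an incomplete type is ill-formed, so instantiating this body
    // where T is incomplete becomes a compile error rather than a silent bug.
    static_assert(sizeof(T) > 0, "T must be complete where CheckedDelete is instantiated");
    delete t;
}

// A complete type that counts how many times its destructor has run.
struct Widget {
    static int destroyed;
    ~Widget() { ++destroyed; }
};
int Widget::destroyed = 0;

// Deletes one Widget through the checked helper and reports the running count.
int destroyOne() {
    CheckedDelete(new Widget);
    return Widget::destroyed;
}
```

If a translation unit that only sees a forward declaration of the type ends up instantiating the body (for example, to inline it), the build breaks there, which at least answers the "was the body instantiated here?" question for that compiler.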

As a corollary, are the rules the same for member functions of templated classes with a similar extern template class declaration?

As an experiment, MSVC produces the correct output for the above code, but if the extern line is removed it generates a warning about deleting an incomplete type. However, this behavior is a remnant of a non-standard extension they introduced years ago, so I'm not sure how much I can trust it. I don't have access to any other build environments to experiment on [save ideone et al, but being limited to one translation unit is rather limiting in this case].

asked Jul 17 '11 19:07

Dennis Zickefoose

3 Answers

The idea behind extern templates is to make explicit template instantiations more useful.

As you know, in C++03, you can explicitly instantiate a template using this syntax:

template class SomeTemplateClass<int>;
template void foo<bool>();

This tells the compiler to instantiate the template in the current translation unit. However, this doesn't stop implicit instantiations from happening: the compiler still has to perform all implicit instantiations and then merge them together again during linking.

Example:

// a.h
template <typename> void foo() { /* ... */ }

// a.cpp
#include "a.h"
template void foo<int>();

// b.cpp
#include "a.h"
int main()
{
    foo<int>();
    return 0;
}

Here, a.cpp explicitly instantiates foo<int>(), but once we go to compile b.cpp, it will instantiate it again because b.cpp has no idea that a.cpp is going to instantiate it anyway. For large functions with many different translation units doing implicit instantiations, this can add quite significantly to compile and link time. It may also cause the function to be unnecessarily inlined, which can lead to significant code bloat.

With extern templates, you can let other source files know that you plan to instantiate the template explicitly:

// a.h
template <typename> void foo() { /* ... */ }
extern template void foo<int>();

This way, b.cpp won't cause an instantiation of foo<int>(). The function will be instantiated in a.cpp and will be linked like any normal function. It's also much less likely to be inlined.

Note that this doesn't prevent inlining -- the function could still be inlined at link time in exactly the same way that a normal non-inline function can still be inlined.
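To see the declaration and the definition working together, here is a single-file sketch. It squeezes what would normally be a.h, b.cpp and a.cpp into one translation unit, which is allowed as long as the extern template declaration comes before the explicit instantiation definition. The names twice and callTwice are made up for illustration.

```cpp
// --- what would live in a.h ---
template <typename T>
T twice(T value) { return value + value; }

// Explicit instantiation declaration: users of twice<int> are told that an
// explicit instantiation definition exists somewhere in the program.
extern template int twice<int>(int);

// --- what would live in b.cpp ---
// Uses twice<int> without needing to emit its own out-of-line copy.
int callTwice() { return twice<int>(21); }

// --- what would live in a.cpp ---
// Explicit instantiation definition: the one place that emits twice<int>.
template int twice<int>(int);
```

In a real project the last line would sit in exactly one source file, and every other file would only see the extern template line through the header.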

EDIT: For those that are curious, I just did a quick test to see how much time g++ spends instantiating templates. I tried instantiating std::sort<int*> in a varying number of translation units, with and without the instantiation being suppressed. The result was conclusive: 30ms per instantiation of std::sort. There's definitely time to be saved here in a large project.

answered Oct 21 '22 09:10

Peter Alexander

Here is an interesting example:

#include <algorithm>
#include <string>

extern template class std::basic_string<char>;
int foo(std::string s)
{
    int res = s.length();
    res += s.find("some substring");
    return res;
}

When compiled with g++-7.2 at -O3, this produces a non-inlined call to string::find but an inlined call to string::length.

Without the extern template, everything is indeed inlined. Clang behaves the same way, and MSVC is almost unable to inline anything in either case.

So the answer is: it depends, and compilers may have special heuristics for this.
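A likely explanation, as far as I can tell from the [temp.explicit] wording: an explicit instantiation declaration does not suppress implicit instantiation of inline functions, so members defined inside the class body remain visible to the inliner, while members defined out of class and not marked inline are left to the explicit instantiation. Presumably length falls in the first group and find in the second in that standard library. The sketch below mimics that split with a hypothetical Buffer class; the names Buffer, size, doubled and measure are all invented for illustration.

```cpp
#include <cstddef>

template <typename T>
class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n) {}

    // Defined in-class, hence implicitly inline: extern template does not
    // suppress its implicit instantiation, so it can still be inlined.
    std::size_t size() const { return size_; }

    // Only declared here; the out-of-class definition below is not inline.
    std::size_t doubled() const;

private:
    std::size_t size_;
};

template <typename T>
std::size_t Buffer<T>::doubled() const { return size_ * 2; }

// Promise that Buffer<char> is explicitly instantiated somewhere.
extern template class Buffer<char>;

// Calls one inline member and one non-inline member.
std::size_t measure(const Buffer<char>& b) { return b.size() + b.doubled(); }

// Keep the promise (normally done in exactly one .cpp file).
template class Buffer<char>;
```

Whether a particular compiler actually inlines size() or calls doubled() out of line is still its own decision; comparing the generated assembly with and without the extern line is the only way to know for sure.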

answered Oct 21 '22 10:10

Jean-Michaël Celerier

Using extern template class does not seem to prevent inlining. I will illustrate this with an example. It is a bit involved, but it is the simplest I can come up with.

In file a.h we define template class CFoo,

#ifndef A_H
#define A_H
#include <iostream>

template <typename T> class CFoo{
  public: CFoo(){
      std::cout << "CFoo Constructor, edit 0" << std::endl;
    }
};

extern template class CFoo<int>;
#endif

At the end of a.h we use extern template class CFoo<int> to tell any translation unit that includes a.h that it does not need to generate any code for CFoo<int>. It's a promise we make that all things CFoo<int> will link smoothly.

In file c.cpp we have,

#include "a.h"

void run(){
  CFoo<int> cf;
}

Due to the extern template class "promise" at the end of a.h, the translation unit of c.cpp does not "need to" generate any code for class CFoo.

Finally we declare a main function in b.cpp,

void run();
int main(){
  run();
  return 0;
}

There is nothing fancy in b.cpp: we simply declare void run(), which will be linked to its implementation in the translation unit c.cpp at link time. For completeness, here is a makefile:

cflags = -std=c++11 -O1

# Recipe lines must start with a tab character.
b : b.o a.o c.o
	g++ ${cflags} b.o a.o c.o -o b

b.o : b.cpp
	g++ ${cflags} -c b.cpp -o b.o

c.o : c.cpp
	g++ ${cflags} -c c.cpp -o c.o

a.o : a.cpp a.h
	g++ ${cflags} -c a.cpp -o a.o

clean:
	rm -rf a.o b.o c.o b

Running make with this makefile compiles and links an executable b, which outputs "CFoo Constructor, edit 0" when run. But note! In the example above we do not seem to have instantiated CFoo<int> anywhere: CFoo<int> is definitely not instantiated in translation unit b.cpp, as the header is not included there, and translation unit c.cpp was told that it didn't need to generate code for CFoo<int>. So what's going on?

Make one change to the makefile: replace -O1 with -O0, then run make clean and make.

Now, the link call results in an error (using gcc 4.8.4)

c.o: In function `run()':
c.cpp:(.text+0x10): undefined reference to `CFoo<int>::CFoo()'

This is the error we would expect if there were no inlining in the first place. At least, this is the conclusion I come to; further ideas are very welcome.

To get linking with -O0, we need to keep our promise and provide an instantiation of CFoo<int>. We provide this in file a.cpp:

#include "a.h"
template class CFoo<int>;

We can now be guaranteed that CFoo<int> is instantiated in the translation unit of a.cpp, and our promise will be kept. As an aside, note that template class CFoo<int> in a.cpp is preceded by extern template class CFoo<int> via the inclusion of a.h, which is not problematic.

Finally, I find this unpredictable, optimisation-dependent behaviour annoying, as it means that modifications to a.h followed by recompilation of a.cpp might not be reflected in run() the way they would be if there were no inlining (try changing the output of the CFoo constructor and running make again).

answered Oct 21 '22 10:10

newling

Tags: c++, c++11, templates