I am using the glm library, which is a header-only collection of math utilities intended for 3D graphics. By using -ftime-trace on Clang and ClangBuildAnalyzer, I've noticed that a lot of time is being spent instantiating glm types:
**** Templates that took longest to instantiate:
16872 ms: glm::vec<4, signed char, glm::packed_highp> (78 times, avg 216 ms)
15675 ms: glm::vec<4, unsigned char, glm::packed_highp> (78 times, avg 200 ms)
15578 ms: glm::vec<4, float, glm::packed_highp> (78 times, avg 199 ms)
...
So, I decided to create a wrapper header/source pair for glm, and use extern template to avoid unnecessary instantiations:
// glmwrapper.h
#pragma once
#include <glm.hpp>
extern template struct glm::vec<4, signed char, glm::packed_highp>;
extern template struct glm::vec<4, unsigned char, glm::packed_highp>;
extern template struct glm::vec<4, float, glm::packed_highp>;
// glmwrapper.cpp
#include "glmwrapper.h"
template struct glm::vec<4, signed char, glm::packed_highp>;
template struct glm::vec<4, unsigned char, glm::packed_highp>;
template struct glm::vec<4, float, glm::packed_highp>;
Now, in my project, I include "glmwrapper.h" instead of <glm.hpp>. Unfortunately, that did not change anything: using -ftime-trace and ClangBuildAnalyzer again reports the same number of instantiations, and there is no measurable difference in compilation time.
I suspect that this is because #include <glm.hpp> does actually end up including the template definition, and at that point the subsequent extern template declarations are just redundant.
Is there a way to achieve what I want without modifying the glm library?
In pseudocode, I kinda want something like this:
// glmwrapper.h (pseudocode)
#pragma once
#include <glm.hpp>
// Make definition of the templates unavailable:
undefine template struct glm::vec<4, signed char, glm::packed_highp>;
undefine template struct glm::vec<4, unsigned char, glm::packed_highp>;
undefine template struct glm::vec<4, float, glm::packed_highp>;
// Make declaration of the templates available:
extern template struct glm::vec<4, signed char, glm::packed_highp>;
extern template struct glm::vec<4, unsigned char, glm::packed_highp>;
extern template struct glm::vec<4, float, glm::packed_highp>;
// glmwrapper.cpp (pseudocode)
// Define templates only in the `.cpp`, not in the header:
template struct glm::vec<4, signed char, glm::packed_highp>;
template struct glm::vec<4, unsigned char, glm::packed_highp>;
template struct glm::vec<4, float, glm::packed_highp>;
Unfortunately, there’s no way to avoid these instantiations. An explicit instantiation declaration of a class template doesn’t prevent (implicit) instantiation of that template; it merely prevents instantiating its non-inline, non-template member functions (which is often none of them!) because some other translation unit will supply the actual function symbols and object code.
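A minimal sketch of that distinction, assuming a hypothetical Vec4 template as a stand-in for a header-only type like glm::vec (the names Vec4, sum, dot and demo are made up for illustration, and a single file plays the role of both the wrapper header and its .cpp):

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a header-only library type such as glm::vec.
template <class T>
struct Vec4 {
    T x, y, z, w;

    // Defined inside the class, so implicitly inline: an explicit
    // instantiation declaration does not suppress its instantiation.
    T sum() const { return x + y + z + w; }

    // Declared here and defined out of line below: a non-inline member,
    // which is the only kind that `extern template` actually affects.
    T dot(const Vec4& o) const;
};

template <class T>
T Vec4<T>::dot(const Vec4& o) const {
    return x * o.x + y * o.y + z * o.z + w * o.w;
}

// Explicit instantiation declaration, as the wrapper header would contain.
extern template struct Vec4<float>;

// Asking for the size or for type traits needs a complete type, so the
// class itself is still implicitly instantiated at this point.
static_assert(sizeof(Vec4<float>) == 4 * sizeof(float),
              "layout is known despite the extern template declaration");
static_assert(std::is_trivially_copyable<Vec4<float>>::value,
              "the class definition was instantiated");

// Explicit instantiation definition, as the wrapper .cpp would contain.
// It must follow the declaration when both appear in one translation unit.
template struct Vec4<float>;

float demo() {
    Vec4<float> a{1.0f, 2.0f, 3.0f, 4.0f};
    Vec4<float> b{1.0f, 1.0f, 1.0f, 1.0f};
    assert(a.sum() == 10.0f);  // inline member, instantiated on use
    return a.dot(b);           // non-inline member, from the explicit instantiation
}
```

The static_assert lines compile even though only the extern template declaration has been seen at that point, which is the behavior -ftime-trace is reporting. In a library where nearly every member is defined in the class body, as in this sketch's sum, there is very little left for the declaration to suppress.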
It’s not that seeing the template definition causes instantiation (which specialization would be instantiated?). The reason is that code which requires that the class be complete still needs to know its layout and member function declarations (for overload resolution), and in general there’s no way to know those short of instantiating the class:
#include <type_traits>

template<class T> struct A : T::B {
  typename std::conditional<(sizeof(T) < 8), long, short>::type first;
  typename T::X second;
  A() noexcept(T::y) = default; // perhaps deleted
  using T::B::foo;
  void foo(T);
  // and so on…
};
void f() { A<C> a; a.foo(a.first); } // …maybe?
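To make that concrete, here is a cut-down, runnable variant of the sketch above. The two argument types Small and Large, and the helpers call_small and call_large, are hypothetical and exist only to show that the member types and the overload chosen for foo cannot be known without instantiating A for each argument:

```cpp
#include <cassert>
#include <type_traits>

// Two hypothetical argument types that differ in size, nested base and X.
struct Small {
    struct B { int foo(int) const { return 1; } };
    using X = char;
    char pad[4];
};

struct Large {
    struct B { int foo(double) const { return 2; } };
    using X = double;
    char pad[16];
};

template <class T>
struct A : T::B {
    // The type of `first` depends on sizeof(T).
    typename std::conditional<(sizeof(T) < 8), long, short>::type first{};
    // The type of `second` depends on a nested name in T.
    typename T::X second{};
    // The overload set of foo depends on what T::B declares.
    using T::B::foo;
    int foo(T) const { return 3; }
};

// Same template, different layouts: only instantiation reveals them.
static_assert(std::is_same<decltype(A<Small>::first), long>::value, "");
static_assert(std::is_same<decltype(A<Large>::first), short>::value, "");
static_assert(std::is_same<decltype(A<Small>::second), char>::value, "");
static_assert(std::is_same<decltype(A<Large>::second), double>::value, "");

// The same call expression resolves to a different function per argument:
// Small::B::foo(int) in one case, Large::B::foo(double) in the other.
int call_small() { A<Small> a; return a.foo(a.first); }
int call_large() { A<Large> a; return a.foo(a.first); }
```

Neither call picks A::foo(T), because a long or short does not convert to Small or Large; the compiler can only work that out after building the full member list of each specialization.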
This “transparency” extends to several other kinds of templated entities as well: if compilation needs the definition of a template, the symbols generated for the linker are irrelevant.
The good news is that C++20’s modules should help with situations like this: an explicit instantiation definition in a module interface will cause a typical implementation to cache the instantiated class definition with the rest of the module interface data, avoiding both parsing and instantiation in importing translation units. Modules also remove the implicit inline on class members and friends defined in the class (which hasn’t meant much in a long time anyway), increasing the number (or, put differently, the convenience) of functions for which explicit instantiation declarations do prevent implicit instantiation.