Using `extern template` with third-party header-only library

I am using the glm library, which is a header-only collection of math utilities intended for 3D graphics. By using -ftime-trace on Clang and ClangBuildAnalyzer, I've noticed that a lot of time is being spent instantiating glm types:

**** Templates that took longest to instantiate:
 16872 ms: glm::vec<4, signed char, glm::packed_highp> (78 times, avg 216 ms)
 15675 ms: glm::vec<4, unsigned char, glm::packed_highp> (78 times, avg 200 ms)
 15578 ms: glm::vec<4, float, glm::packed_highp> (78 times, avg 199 ms)

...

So, I decided to create a wrapper header/source pair for glm, and use extern template to avoid unnecessary instantiations:

// glmwrapper.h

#pragma once

#include <glm.hpp>

extern template struct glm::vec<4, signed char, glm::packed_highp>;
extern template struct glm::vec<4, unsigned char, glm::packed_highp>;
extern template struct glm::vec<4, float, glm::packed_highp>;

// glmwrapper.cpp

#include "glmwrapper.h"

template struct glm::vec<4, signed char, glm::packed_highp>;
template struct glm::vec<4, unsigned char, glm::packed_highp>;
template struct glm::vec<4, float, glm::packed_highp>;

Now, in my project, I include "glmwrapper.h" instead of <glm.hpp>. Unfortunately, that did not change anything: running -ftime-trace and ClangBuildAnalyzer again reports the same number of instantiations, and there is no measurable difference in compilation time.

I suspect that this is because #include <glm.hpp> does actually end up including the template definition, and at that point the subsequent extern template declarations are just redundant.

Is there a way to achieve what I want without modifying the glm library?


In pseudocode, I kinda want something like this:

// glmwrapper.h (pseudocode)

#pragma once

#include <glm.hpp>

// Make definition of the templates unavailable:
undefine template struct glm::vec<4, signed char, glm::packed_highp>;
undefine template struct glm::vec<4, unsigned char, glm::packed_highp>;
undefine template struct glm::vec<4, float, glm::packed_highp>;

// Make declaration of the templates available:
extern template struct glm::vec<4, signed char, glm::packed_highp>;
extern template struct glm::vec<4, unsigned char, glm::packed_highp>;
extern template struct glm::vec<4, float, glm::packed_highp>;

// glmwrapper.cpp (pseudocode)

// Define templates only in the `.cpp`, not in the header:
template struct glm::vec<4, signed char, glm::packed_highp>;
template struct glm::vec<4, unsigned char, glm::packed_highp>;
template struct glm::vec<4, float, glm::packed_highp>;
Vittorio Romeo asked Apr 28 '20 09:04



1 Answer

Unfortunately, there’s no way to avoid these instantiations. An explicit instantiation declaration of a class template doesn’t prevent (implicit) instantiation of that template; it merely prevents instantiating its non-inline, non-template member functions (which is often none of them!) because some other translation unit will supply the actual function symbols and object code.
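To make that distinction concrete, here is a minimal single-file sketch. `Vec4` and `demo` are hypothetical names invented for illustration (this is not glm's actual definition), and the explicit instantiation definition at the bottom would normally live in a single separate `.cpp`. The class itself is still implicitly instantiated wherever it must be complete; only the out-of-class, non-inline member is left for the explicit instantiation definition to provide.

```cpp
#include <cassert>

// Hypothetical stand-in for a header-only vector type (not glm's real code).
template <class T>
struct Vec4 {
    T x, y, z, w;
    T first() const { return x; }  // defined in class: implicitly inline
    T sum() const;                 // defined out of class: not inline
};

template <class T>
T Vec4<T>::sum() const { return x + y + z + w; }

// Explicit instantiation declaration: "some translation unit provides this".
extern template struct Vec4<float>;

// sizeof needs a complete type, so the class is implicitly instantiated
// here regardless of the declaration above.
static_assert(sizeof(Vec4<float>) >= 4 * sizeof(float),
              "Vec4<float> is complete at this point");

// Hypothetical driver exercising both kinds of member function.
float demo() {
    Vec4<float> v{1.0f, 2.0f, 3.0f, 4.0f};
    assert(v.first() == 1.0f);  // inline member: still instantiated here
    return v.sum();             // non-inline member: left to the definition below
}

// Explicit instantiation definition (normally placed in exactly one .cpp).
template struct Vec4<float>;
```

In a type where every member function is defined inside the class body, as is common in header-only libraries, the declaration has nothing left to suppress.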

It’s not that seeing the template definition causes instantiation (which specialization would be instantiated?). The reason is that code which requires that the class be complete still needs to know its layout and member function declarations (for overload resolution), and in general there’s no way to know those short of instantiating the class:

#include <type_traits>

template<class T> struct A : T::B {
  typename std::conditional<(sizeof(T)<8),long,short>::type first;
  typename T::X second;
  A() noexcept(T::y)=default;  // perhaps deleted
  using T::B::foo;
  void foo(T);
  // and so on…
};

void f() {A<C> a; a.foo(a.first);}  // …maybe?
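A reduced, compilable version of just the layout point might look like the sketch below. `Holder`, `Big`, `first_is_long` and `check_layouts` are hypothetical names chosen for this illustration. The declared type of `first` is only known once a concrete `T` is substituted, which is exactly the work that instantiation performs.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class template whose member type depends on sizeof(T).
template <class T>
struct Holder {
    typename std::conditional<(sizeof(T) < 8), long, short>::type first;
};

// Hypothetical type that is guaranteed to be at least 16 bytes.
struct Big { char bytes[16]; };

// Reports whether Holder<T>::first was declared as long.
// Answering this requires instantiating Holder<T>.
template <class T>
constexpr bool first_is_long() {
    return std::is_same<decltype(Holder<T>::first), long>::value;
}

// Hypothetical driver: the same template yields different member types.
void check_layouts() {
    assert(first_is_long<char>());   // sizeof(char) == 1, so first is long
    assert(!first_is_long<Big>());   // sizeof(Big) >= 16, so first is short
}
```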

This “transparency” extends to several other kinds of templated entities as well: if compilation needs the definition of a template, the symbols generated for the linker are irrelevant.

The good news is that C++20’s modules should help with situations like this: an explicit instantiation definition in a module interface will cause a typical implementation to cache the instantiated class definition with the rest of the module interface data, avoiding both parsing and instantiation in importing translation units. Modules also remove the implicit inline on class members and friends defined in the class (which hasn’t meant much in a long time anyway), increasing the number (or, put differently, the convenience) of functions for which explicit instantiation declarations do prevent implicit instantiation.

Davis Herring answered Sep 24 '22 01:09