Suppose I have a class which contains a large number of other class declarations. Is it possible to spread the cost of these out somehow so that compile-time memory consumption doesn't grow quadratically for nested types? I'm willing to take a hit on compile time if need be, and would gladly divide this out into different translation units if that were an option.
To try and come up with a solution to this I've written the following program that illustrates a simplified version of the kind of code that leads to these blowouts:
// Add T to the list of args of SeqWithArgs N number of times:
template <int N, typename T, typename SeqWithArgs>
struct Append;
template <int N, typename T, template <typename...> class Seq, typename... Args>   
struct Append<N, T, Seq<Args...>>                                                  
{                                                                                  
    using type = typename Append<N-1, T, Seq<Args..., T>>::type;                   
};                                                                                 
template <typename T, template<typename...> class Seq, typename... Args>           
struct Append<0, T, Seq<Args...>>                                                  
{                                                                                  
    using type = Seq<Args...>;                                                     
};                                                                                 
static constexpr const int N = 10;                                              
// Tuple containing N instances of int
using Small = typename Append<N, int, std::tuple<>>::type;
// Tuple containing N instances of Small
using Big = typename Append<N, Small, std::tuple<>>::type;
// Tuple containing N instances of Big
using Huge = typename Append<N, Big, std::tuple<>>::type;
int main()
{                                                                               
    Huge h;                                                                     
    return 0;                                                                   
}
As righly pointed out by Yakk, these Append operations are horribly inefficient. However, in the real version of this code modifying these would require a fundamental restructure of the code.
I varied N from 10 to 70 and got this result compiling my program using GCC 4.8.1. I also ran time -v make to obtain peak resident memory usage. Here are the results using just the default flags:

This result seems excessive to me, not because of the shape (it is expected to be O(N^3) and seems to follow that shape), but because of the magnitude. It almost looks like Small is being expanded for every instantiation of Big, and Big in turn is being expanded for every instantiation of Huge. In less template-heavy code one would normally declare an generic specialisation of some type using the extern keyword and would therefore avoid this "nested expansion", but these are types, not values; does something similar exist for this kind of construct?
What is the cause of this blowout in memory and what can I do to reduce this memory footprint without changing the type of Small, Big and Huge?
Append generates Seq<T> Seq<T,T> ... Seq<T,...,T>.  Which is less of a problem than the fact it generates Append<n-1, T, Seq<T>>, which is a distinct and unique type for each N and each recursion.
It generates N unique types of total name length O(n^2*|T|), and the output type is of size O(n*|T|).
We then chain this.
Big generates types of total size O(n^2*O(n*|int|)), with end type size O(n^2|int|).  Huge types of size O(n^2*O(n^2|int|))=O(n^4|int|).
That is a lot of types generated.
70^4=5000^2=O(25 million) total type length.
We can do better with a less brain dead Append. Do it in three steps.
transcribe takes a t<Ts...> and a template<class...>class Seq and copies the Ts... over.
template<class...>struct t{using type=t;};
template<class src, template<class...>class dest>
struct transcribe;
template<class...Ts, template<class...>class dest>
struct transcribe<t<Ts...>,dest>{
  using type=dest<Ts...>;
};
template<class src, template<class...>class dest>
using transcribe_t=typename transcribe<src, dest>::type;
append takes any number of t<...>s and appends them.
template<class... ts>
struct append;
template<>struct append<>{using type=t<>;};
template<class...Ts>struct append<t<Ts...>>{using type=t<Ts...>;};
template<class... Ts, class... Us, class... Zs>
struct append<t<Ts...>,t<Us...>,Zs....>:
  append<t<Ts...,Us...>,Zs...>
{};
template<class...ts>
using append_t=typename append<ts...>::type;
breaker takes an unsigned value N and breaks it down to powers of two, outputting 
v<0,1,3,4> (2^0+2^1+2^3+2^4) for 26.
template<unsigned...>struct v{using type=v;};
template<unsigned X, class V=v<>, unsigned B=0, class=void>
struct breaker;
template<unsigned X, unsigned...vs, unsigned B>
struct breaker<X, v<vs...>, B, typename std::enable_if<
  X&(1<<B)
>::type>:breaker<X&~(1<<B), v<vs...,B>, B+1>
{};
template<unsigned X, unsigned...vs, unsigned B>
struct breaker<X, v<vs...>, B, typename std::enable_if<
  !(X&(1<<B))
>::type>:breaker<X&~(1<<B), v<vs...>, B+1>
{};
template<unsigned X, unsigned...vs, unsigned B>
struct breaker<0, v<vs...>, B, void>
{
  using type=v<vs...>;
};
template<unsigned X>
using breaker_t=typename breaker<X>::type;
Build takes an unsigned value N and a T.  It Breaks the N.  Then we build the power of twos into t<T,T,T,...,T>s.  Then we append them.
Then we take the output and transcribe to Seq<...>.
This generates O(N*logN*logN) types.  So for large N may be better.  Plus most of the types generated are small and simple, which is a plus.
At best this reduces your load by a factor of 10. Worth a try.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With