A common compiler optimization is to transform tail-recursive functions into loops, speeding up the execution time and reducing memory (stack) consumption:
int go_to_zero( int n )
{
if( n == 0 )
return 0;
else
return go_to_zero( n - 1 );
}
My question is simple: Is there any performance benefit (i.e. reducing compile time) doing tail-recursive algorithms on template meta-programming?
Here an example:
template<typename... Ts>
struct list {};
template<typename LIST>
struct reverse;
template<typename HEAD , typename... TAIL>
struct reverse<list<HEAD,TAIL...>>
{
using result = concat<typename reverse<list<TAIL...>>::result,list<HEAD>>;
};
template<>
struct reverse<list<>>
{
using result = list<>;
};
versus:
template<typename INPUT , typename OUTPUT>
struct reverse_impl;
template<typename HEAD , typename... TAIL , tyename... Ts>
struct reverse_impl<list<HEAD,TAIL...>,list<Ts...>>
{
using result = typename reverse_impl<list<TAIL...>,list<Ts...,HEAD>>::result;
};
template<typename... Ts>
struct reverse_impl<list<>,list<Ts...>>
{
using result = list<Ts...>;
};
template<typename LIST>
using reverse = typename reverse_impl<LIST,list<>>::result;
In C++ Template Metaprogramming Appendix C "Compile Time Performance" Abraham and Gurtovoy messured how memoization of template instantiation effects compile time. The book was written in 2005 and I think memoization is better implemented today (I didn't measure). So the question to answer is which implementation can benefit more from memoization. I have edited the code a little bit Version 1 and Version 2. Now it will emit an error when reverse_impl
is instantiated, so we can simply count the errors. I reverse 2 lists list<int, short, char>
and list<short, char>
. Version 1 emits 4 errors and Version 2 emits 7 errors. In this particular case Version 1 benefits from memoization with these two lists (list2 is the tail of list1) and Version 2 doesn't. So I would expect Version 1 is faster.
So it would be best to implement algorithms in terms of other algorithms which are known to benefit from memoization and I think most of them use head recursion.
All the pros and cons of regular vs. tail recursion apply also to the metaprogramming. The difference in this case is that instead of the execution of a compiled program on a target machine, the program is executed in a compiler sand-box and compiles to the target language instead of to the machine language. This process is conceptually not much different from compiling a Java program into a bytecode.
C++ compilers have rather limited allowed depth of template nesting (~hundreds) and if the algorithm grows exponentially, it can be a blocker to usage of regular recursion; tail recursion might help here.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With