Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Switch statement variadic template expansion

Let me please consider the following synthetic example:

inline int fun2(int x) {
    return x;
}
inline int fun2(double x) {
    return 0;   
}
inline int fun2(float x) {
    return -1;   
}

int fun(const std::tuple<int,double,float>& t, std::size_t i) {
    switch(i) {
        case 0: return fun2(std::get<0>(t));
        case 1: return fun2(std::get<1>(t));
        case 2: return fun2(std::get<2>(t));
    }    
}

The question is how should I expand this to the general case

template<class... Args> int fun(const std::tuple<Args...>& t, std::size_t i) {
// ?
}

Guaranteeing that

  1. fun2 can be inlined into fun
  2. search complexity not worse than O(log(i)) (for large i).

It is known that optimizer usually uses lookup jump table or compile-time binary search tree when large enough switch expanded. So, I would like to keep this property affecting performance for large number of items.

Update #3: I remeasured performance with uniform random index value:

                      1       10      20      100
@TartanLlama
    gcc               ~0      42.9235 44.7900 46.5233
    clang             10.2046 38.7656 40.4316 41.7557
@chris-beck
    gcc               ~0      37.564  51.3653 81.552
    clang             ~0      38.0361 51.6968 83.7704
naive tail recursion
    gcc                3.0798 40.6061 48.6744 118.171
    clang             11.5907 40.6197 42.8172 137.066
manual switch statement
    gcc                       41.7236 
    clang                      7.3768 

Update #2: It seems that clang is able to inline functions in @TartanLlama solution whereas gcc always generates function call.

like image 237
0x2207 Avatar asked Aug 26 '15 20:08

0x2207


2 Answers

Ok, I rewrote my answer. This gives a different approach to what TartanLlama and also what I suggested before. This meets your complexity requirement and doesn't use function pointers so everything is inlineable.

Edit: Much thanks to Yakk for pointing out a quite significant optimization (for the compile-time template recursion depth required) in comments

Basically I make a binary tree of the types / function handlers using templates, and implement the binary search manually.

It might be possible to do this more cleanly using either mpl or boost::fusion, but this implementation is self-contained anyways.

It definitely meets your requirements, that the functions are inlineable and runtime look up is O(log n) in the number of types in the tuple.

Here's the complete listing:

#include <cassert>
#include <cstdint>
#include <tuple>
#include <iostream>

using std::size_t;

// Basic typelist object
template<typename... TL>
struct TypeList{
   static const int size = sizeof...(TL);
};

// Metafunction Concat: Concatenate two typelists
template<typename L, typename R>
struct Concat;

template<typename... TL, typename... TR>
struct Concat <TypeList<TL...>, TypeList<TR...>> {
    typedef TypeList<TL..., TR...> type;
};

template<typename L, typename R>
using Concat_t = typename Concat<L,R>::type;

// Metafunction First: Get first type from a typelist
template<typename T>
struct First;

template<typename T, typename... TL>
struct First <TypeList<T, TL...>> {
    typedef T type;
};

template<typename T>
using First_t = typename First<T>::type;


// Metafunction Split: Split a typelist at a particular index
template<int i, typename TL>
struct Split;

template<int k, typename... TL>
struct Split<k, TypeList<TL...>> {
private:
    typedef Split<k/2, TypeList<TL...>> FirstSplit;
    typedef Split<k-k/2, typename FirstSplit::R> SecondSplit;
public:
    typedef Concat_t<typename FirstSplit::L, typename SecondSplit::L> L;
    typedef typename SecondSplit::R R;
};

template<typename T, typename... TL>
struct Split<0, TypeList<T, TL...>> {
    typedef TypeList<> L;
    typedef TypeList<T, TL...> R;
};

template<typename T, typename... TL>
struct Split<1, TypeList<T, TL...>> {
    typedef TypeList<T> L;
    typedef TypeList<TL...> R;
};

template<int k>
struct Split<k, TypeList<>> {
    typedef TypeList<> L;
    typedef TypeList<> R;
};


// Metafunction Subdivide: Split a typelist into two roughly equal typelists
template<typename TL>
struct Subdivide : Split<TL::size / 2, TL> {};

// Metafunction MakeTree: Make a tree from a typelist
template<typename T>
struct MakeTree;

/*
template<>
struct MakeTree<TypeList<>> {
    typedef TypeList<> L;
    typedef TypeList<> R;
    static const int size = 0;
};*/

template<typename T>
struct MakeTree<TypeList<T>> {
    typedef TypeList<> L;
    typedef TypeList<T> R;
    static const int size = R::size;
};

template<typename T1, typename T2, typename... TL>
struct MakeTree<TypeList<T1, T2, TL...>> {
private:
    typedef TypeList<T1, T2, TL...> MyList;
    typedef Subdivide<MyList> MySubdivide;
public:
    typedef MakeTree<typename MySubdivide::L> L;
    typedef MakeTree<typename MySubdivide::R> R;
    static const int size = L::size + R::size;
};

// Typehandler: What our lists will be made of
template<typename T>
struct type_handler_helper {
    typedef int result_type;
    typedef T input_type;
    typedef result_type (*func_ptr_type)(const input_type &);
};

template<typename T, typename type_handler_helper<T>::func_ptr_type me>
struct type_handler {
    typedef type_handler_helper<T> base;
    typedef typename base::func_ptr_type func_ptr_type;
    typedef typename base::result_type result_type;
    typedef typename base::input_type input_type;

    static constexpr func_ptr_type my_func = me;
    static result_type apply(const input_type & t) {
        return me(t);
    }
};

// Binary search implementation
template <typename T, bool b = (T::L::size != 0)>
struct apply_helper;

template <typename T>
struct apply_helper<T, false> {
    template<typename V>
    static int apply(const V & v, size_t index) {
        assert(index == 0);
        return First_t<typename T::R>::apply(v);
    }
};

template <typename T>
struct apply_helper<T, true> {
    template<typename V>
    static int apply(const V & v, size_t index) {
        if( index >= T::L::size ) {
            return apply_helper<typename T::R>::apply(v, index - T::L::size);
        } else {
            return apply_helper<typename T::L>::apply(v, index);
        }
    }
};

// Original functions

inline int fun2(int x) {
    return x;
}
inline int fun2(double x) {
    return 0;   
}
inline int fun2(float x) {
    return -1;   
}

// Adapted functions
typedef std::tuple<int, double, float> tup;

inline int g0(const tup & t) { return fun2(std::get<0>(t)); }
inline int g1(const tup & t) { return fun2(std::get<1>(t)); }
inline int g2(const tup & t) { return fun2(std::get<2>(t)); }

// Registry

typedef TypeList<
   type_handler<tup, &g0>,
   type_handler<tup, &g1>,
   type_handler<tup, &g2>
> registry;

typedef MakeTree<registry> jump_table;

int apply(const tup & t, size_t index) {
    return apply_helper<jump_table>::apply(t, index);
}

// Demo

int main() {
    {
        tup t{5, 1.5, 15.5f};

        std::cout << apply(t, 0) << std::endl;
        std::cout << apply(t, 1) << std::endl;
        std::cout << apply(t, 2) << std::endl;
    }

    {
        tup t{10, 1.5, 15.5f};

        std::cout << apply(t, 0) << std::endl;
        std::cout << apply(t, 1) << std::endl;
        std::cout << apply(t, 2) << std::endl;
    }

    {
        tup t{15, 1.5, 15.5f};

        std::cout << apply(t, 0) << std::endl;
        std::cout << apply(t, 1) << std::endl;
        std::cout << apply(t, 2) << std::endl;
    }

    {
        tup t{20, 1.5, 15.5f};

        std::cout << apply(t, 0) << std::endl;
        std::cout << apply(t, 1) << std::endl;
        std::cout << apply(t, 2) << std::endl;
    }
}

Live on Coliru: http://coliru.stacked-crooked.com/a/3cfbd4d9ebd3bb3a

like image 92
Chris Beck Avatar answered Oct 17 '22 15:10

Chris Beck


If you make fun2 into a class with overloaded operator():

struct fun2 {
    inline int operator()(int x) {
        return x;
    }
    inline int operator()(double x) {
        return 0;   
    }
    inline int operator()(float x) {
        return -1;   
    }
};

then we can modify dyp's answer from here to work for us.

Note that this would look a lot neater in C++14, as we could have all the return types deduced and use std::index_sequence.

//call the function with the tuple element at the given index
template<class Ret, int N, class T, class Func>
auto apply_one(T&& p, Func func) -> Ret
{
    return func( std::get<N>(std::forward<T>(p)) );
}

//call with runtime index
template<class Ret, class T, class Func, int... Is>
auto apply(T&& p, int index, Func func, seq<Is...>) -> Ret
{
    using FT = Ret(T&&, Func);
    //build up a constexpr array of function pointers to index
    static constexpr FT* arr[] = { &apply_one<Ret, Is, T&&, Func>... };
    //call the function pointer at the specified index
    return arr[index](std::forward<T>(p), func);
}

//tag dispatcher
template<class Ret, class T, class Func>
auto apply(T&& p, int index, Func func) -> Ret
{
    return apply<Ret>(std::forward<T>(p), index, func, 
                      gen_seq<std::tuple_size<typename std::decay<T>::type>::value>{});
}

We then call apply and pass the return type as a template argument (you could deduce this using decltype or C++14):

auto t = std::make_tuple(1,1.0,1.0f);
std::cout << apply<int>(t, 0, fun2{}) << std::endl;
std::cout << apply<int>(t, 1, fun2{}) << std::endl;
std::cout << apply<int>(t, 2, fun2{}) << std::endl;

Live Demo

I'm not sure if this will completely fulfil your requirements due to the use of function pointers, but compilers can optimize this kind of thing pretty aggressively. The searching will be O(1) as the pointer array is just built once then indexed directly, which is pretty good. I'd try this out, measure, and see if it'll work for you.

like image 31
TartanLlama Avatar answered Oct 17 '22 14:10

TartanLlama