Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to: Extend C++14 template function to variadic template, arguments

I'm a returning C++ programmer who has been away from the language for several years (C++11 had just started gaining real traction when I was last active in the language). I've been actively developing data science apps in Python for the past few. As a learning exercise to get back up to speed I decided to implement Python's zip() function in C++14 and now have a working function that can take any two STL (and a few others) containers holding any types and "zip" them into a vector of tuples:

template <typename _Cont1, typename _Cont2>
auto pyzip(_Cont1&& container1, _Cont2&& container2) {
    using std::begin;
    using std::end;
    
    using _T1 = std::decay_t<decltype(*container1.begin())>;
    using _T2 = std::decay_t<decltype(*container2.begin())>;

    auto first1 = begin(std::forward<_Cont1>(container1));
    auto last1 = end(std::forward<_Cont1>(container1));
    auto first2 = begin(std::forward<_Cont2>(container2));
    auto last2 = end(std::forward<_Cont2>(container2));
    
    std::vector<std::tuple<_T1, _T2>> result;
    result.reserve(std::min(std::distance(first1, last1), std::distance(first2, last2)));
    for (; first1 != last1 && first2 != last2; ++first1, ++first2) {
        result.push_back(std::make_tuple(*first1, *first2));
    }
    return result;
}

For example, the following code (taken from a code cell in Jupyter notebook running the xeus-cling C++14 kernel)

#include <list>
#include <xtensor/xarray.hpp>

list<int> v1 {1, 2, 3, 4, 5};
xt::xarray<double> v2 {6.01, 7.02, 8.03};
auto zipped = pyzip(v1, v2);

for (auto tup: zipped)
    cout << '(' << std::get<0>(tup) << ", " << std::get<1>(tup) << ") ";

produces this output:

(1, 6.01) (2, 7.02) (3, 8.03)

I want to extend my function to take an arbitrary number of containers of arbitrary types, and I've spent some time researching variadic templates, but to my embarrassment I'm just not connecting the dots. How could I generalize this function to take an arbitrary number of arbitrary container types holding arbitrary data types? I'm not necessarily looking for the exact code that I would need, but I could really use some help wrapping my head around how to exploit variadic templates in this scenario.

Also, any critiques of the code I do have would be appreciated.

like image 721
jboley-arnl Avatar asked Mar 01 '23 23:03

jboley-arnl


1 Answers

Variadic templates have a mechanism not too dissimilar to Python's ability to pass a function positional arguments and to then expand those positional arguments into a sequence of values. C++'s mechanism is a bit more powerful and more pattern based.

So let's take it from the top. You want to take an arbitrary series of ranges (containers is too limiting):

template <typename ...Ranges>
auto pyzip(Ranges&& ...ranges)

The use of ... here specifies the declaration of a pack. This particular function declaration declares two "packs": a pack of types named Ranges, and a parameter pack named ranges.

So, the first thing you need to do is get a series of begin and end iterators. Since these iterators can be of arbitrary types, an array won't do; it has to be stored in a tuple. Each element of the tuple needs to be initialized by taking ranges, getting that element, and calling begin on it. Here's how you do that:

auto begin_its = std::make_tuple(begin(std::forward<Ranges>(ranges))...);

This use of ... is called a pack expansion. The expression to its left contains one or more packs. And the ... takes that expression and turns it into a comma-separated sequence of values, substituting each corresponding member of the pack where the pack is listed. The expression being expanded is begin(std::forward<Ranges>(ranges)). And we use both ranges and Ranges here, so both packs are expanded together (and must be of the same size).

We expand this pack into the arguments for make_tuple, so that function gets one parameter for each element in the packs.

Of course, the same thing works for end too.

Next, you want to store copies(?) of the elements from each range in a vector<tuple>. Well, this requires that we first figure out what the value type of the range is. That's easy enough using another pack expansion over the typedef you used in your example:

using vector_elem = std::tuple<std::decay_t<decltype(*begin(std::forward<Ranges>(ranges)))>...>;
std::vector<vector_elem> result;

Note that in this case, the ... doesn't apply to an "expression", but it does the same thing: repeating the std::decay_t part for each element of ranges.

Next, we need to compute the size of our eventual list. That's... actually surprisingly difficult. One might think that you could just use begin_its and end_its, and just iterate over them, or use some pack expansion shenanigans with them. But no, C++ doesn't let you do either of those. Those tuples aren't packs, and you can't (easily) treat them as such.

It's actually easier to just recompute the begin/end iterators and take the difference, all in one expression.

auto size = std::min({std::distance(begin(std::forward<Ranges>(ranges)), end(std::forward<Ranges>(ranges)))...});
result.reserve(std::size_t(size));

Well, "easier" in terms of lines of code, not so much readability;

std::min here takes an initializer-list of values to compute the minimum of.

For our loop, rather than going until the iterators have reached an end state, it's easier to just loop over a count.

But that's really just pushing off the final problem. Namely, we have this tuple of iterators, and we need to perform 2 operations on their members: dereferencing, and incrementing. And it doesn't really matter which one we're doing; it's equally hard in C++.

Oh, it's perfectly doable. You just need a new function.

See, you cannot access an element of a tuple with a runtime index. And you cannot loop over compile-time values. So you need some way to get a pack containing, not parameters or types, but integer indices. This pack of indices can be unpacked into the get<Index> call for the tuple to access its contents.

C++17 gave us a convenient std::apply function for doing exactly this kind of thing. Unfortunately, this is C++14, so we have to write one:

namespace detail {
template <class F, class Tuple, std::size_t... I>
constexpr decltype(auto) apply_impl(F&& f, Tuple&& t, std::index_sequence<I...>)
{
    return f(std::get<I>(std::forward<Tuple>(t))...);
}
}  // namespace detail
 
template <class F, class Tuple>
constexpr decltype(auto) apply(F&& f, Tuple&& t)
{
    return detail::apply_impl(
        std::forward<F>(f), std::forward<Tuple>(t),
        std::make_index_sequence<std::tuple_size<std::remove_reference_t<Tuple>>::value>{});
}

apply here takes a function and a tuple, and unpacks the tuple into the function's arguments, returning whatever the function returns.

So, back in our function, we can use apply to do what we need: indirection, then incrementing of each element. And generic lambdas allow us to deal effectively with inserting the values into the vector through emplace_back:

for (decltype(size) ix = 0; ix < size; ++ix)
{
  apply([&result](auto&& ...its) mutable
  {
    result.emplace_back(*its...); //No need for redundant `make_tuple`+copy.
  }, begin_its);

  apply([](auto& ...its)
  {
    int unused[] = {0, (++its, 0)...};
  }, begin_its);
}

unused and its initializer is a confused way for C++ to just execute an expression on every item in a pack, discarding the results. Unfortunately, it's the most straightforward way to do that in C++14.

Here's the whole working example.

like image 200
Nicol Bolas Avatar answered Mar 05 '23 09:03

Nicol Bolas