Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sorting zipped (locked) containers in C++ using boost or the STL

Here's a working example based on the range-v3 Library that has been proposed for Standardization

#include <range/v3/all.hpp>
#include <iostream>

using namespace ranges;

int main() 
{
    std::vector<int> a1{15, 7, 3,  5};
    std::vector<int> a2{ 1, 2, 6, 21};
    sort(view::zip(a1, a2), std::less<>{}, &std::pair<int, int>::first); 
    std::cout << view::all(a1) << '\n';
    std::cout << view::all(a2) << '\n';
}

Live Example (requires recent compiler with good C++14 support, not VS 2015).


For the two container case, here's a version that compiles on gcc 4.4.6, based on the paper referred to above. In later versions of gcc, you can swap out boost::tuple for std::tuple

#include <iostream>
#include <vector>
#include <iterator>
#include <algorithm>

# include <boost/iterator/iterator_facade.hpp>
# include <boost/tuple/tuple.hpp> 

using namespace std;

template <class T, class T2>
struct helper_type {
  typedef boost::tuple<typename iterator_traits<T>::value_type, typename iterator_traits<T2>::value_type> value_type;
  typedef boost::tuple<typename iterator_traits<T>::value_type&, typename iterator_traits<T2>::value_type&> ref_type;
};

template <typename T1, typename T2>
class dual_iterator : public boost::iterator_facade<dual_iterator<T1, T2>,
                                                    typename helper_type<T1, T2>::value_type,
                                                    boost::random_access_traversal_tag,
                                                    typename helper_type<T1, T2>::ref_type> {
public:
   explicit dual_iterator(T1 iter1, T2 iter2) : mIter1(iter1), mIter2(iter2) {}
   typedef typename iterator_traits<T1>::difference_type difference_type;
private:
   void increment() { ++mIter1; ++mIter2; }
   void decrement() { --mIter1; --mIter2; }
   bool equal(dual_iterator const& other) const { return mIter1 == other.mIter1; }
   typename helper_type<T1, T2>::ref_type dereference() const { return (typename helper_type<T1, T2>::ref_type(*mIter1, *mIter2)); }
   difference_type distance_to(dual_iterator const& other) const { return other.mIter1 - mIter1; }
   void advance(difference_type n) { mIter1 += n; mIter2 += n; }

   T1 mIter1;
   T2 mIter2;
   friend class boost::iterator_core_access;
};

template <typename T1, typename T2>
dual_iterator<T1, T2> make_iter(T1 t1, T2 t2) { return dual_iterator<T1, T2>(t1, t2); }

template <class T1, class T2> struct iter_comp {
  typedef typename helper_type<T1, T2>::value_type T;
  bool operator()(const T& t1, const T& t2) { return get<0>(t1) < get<0>(t2); }
};

template <class T1, class T2> iter_comp<T1, T2> make_comp(T1 t1, T2 t2) { return iter_comp<T1, T2>(); }

template<class T> void print(T& items) {
  copy(items.begin(), items.end(), ostream_iterator<typename T::value_type>(cout, " ")); cout << endl;
}

int main() {
  vector<double> nums1 = {3, 2, 1, 0};
  vector<char> nums2 = {'D','C', 'B', 'A'};
  sort(make_iter(nums1.begin(), nums2.begin()), 
       make_iter(nums1.end(), nums2.end()), 
       make_comp(nums1.begin(), nums2.begin()));
  print(nums1);
  print(nums2);
}

Create an auxillary array that contains the indices 0..N-1. Sort that array with a custom comparator that actually returns results from comparing elements of one of your primary arrays. Then use the sorted auxillary array to print out your primary arrays in the right order.


Nice to meet a fellow Internet Archaeologist !

How do I sort N vectors in lock step (zipped)? The problem looks pretty generic and common so I guess there must be an easy solution through a probably very complex library but I just can't find it.

Sometimes ago, I went on the same treasure hunt with similar assumptions...
Never found the treasure :(

I followed the same tracks as you :

  • Go through the usual suspects boost.iterator/boost.range/boost.fusion/boost.oven and after quite a lot of experimentation and research realize that they can't solve this particular problem.
  • Go through many question on SO, only to realize that every single one have been closed either with incorrect answer (for example recommending boost::zip_iterator which doesn't work in this case as you point out), or with some workaround that avoid the heart of the matter.
  • Go through many blog post, mailing list, only to realize that nobody actually solved the problem except....
  • After much research finally dig up the old codex by Antonius Wilhelm, who claim to have crafted a general solution "TupleIterator" and locked it inside some archive "tupleit.zip". Historical sources are so scarce on this one, that I'm still not sure if this archive is a myth, a legend, or if it still buried somewhere in a lost layer of the internet :)

Ok, more seriously, the paper by Anthony Williams suggest that the problem is actually really hard, so it's not that surprising to find that no existing library like boost solved it.


I'm pleased to say that I've found a solution, after a similar treasure hunt. range-v3 is a great idea if you can use it, but if you really need an iterator, the HPX project has created one and it works perfectly with sort.

Owing to an oversight, which will hopefully be repaired, it still requires you to link with the HPX library but that's OK for me because the whole point was to use C++17 parallel algorithms, which HPX provides an implementation of.

#include <hpx/util/zip_iterator.hpp>

using zip_it = 
    hpx::util::zip_iterator<std::vector<std::size_t>::iterator,
                            std::vector<std::size_t>::iterator,
                            std::vector<double>::iterator>;

int main() {
    std::vector<std::size_t> rows{3, 2, 1};
    std::vector<std::size_t> cols{1, 2, 3};
    std::vector<double> values{4.0, 5.0, 6.0};

    auto start = hpx::util::make_zip_iterator(rows.begin(), cols.begin(), values.begin());
    auto stop  = hpx::util::make_zip_iterator(rows.end(), cols.end(), values.end());

    std::sort(start, stop);

    for ( int i = 0; i < 3; ++i ) {
        std::cerr << rows[i] << ", " << cols[i] << ", " << values[i] << "\n";
    }
}