Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

std::sort without copy construction

Assume that I have a vector of objects where:

  • copy construction and assignments are expensive
  • default construction and swapping of two objects is cheap.

This seems quite standard for objects with references to big data -- for example a vector of vectors.

Question: Is there a way to sort this vector using std::sort, or some other sorting routine from the standard library, so that no copying takes place, but swap is used instead? I'm looking for a pre-c++0x solution (no move semantics).

Overloading the std::swap seemed like a first natural attempt, and it does help a bit, but it gets rid of only a small fraction of the copying.

Note: Example of gcc behaviour

To sort 100 81 64 49 36 25 16 9 4 1 0 1 4 9 16 25 36 49 64 81, my gcc std::sort calls into 19 copy constructors, 92 assignments, and 6 swaps.

like image 483
Dejan Jovanović Avatar asked Jan 06 '13 01:01

Dejan Jovanović


People also ask

Is std::sort () stable?

A sorting algorithm is “stable” if, for two equivalent elements, it preserves their original order relative to each other. It might be useful to know a concrete example of input for which your library's std::sort is unstable.

Is std::sort fast?

According to Scott Meyers, in his Effective STL book - item 46. He claimed that std::sort is about 670% faster than std::qsort due to the fact of inline.

Why does std :: list need a special sort method?

std::sort is a generic algorithm that's supposed to work for anything that provides iterators. However, it really expects a random access iterator to sort efficiently. std::list , being a linked list, cannot provide random access to its elements efficiently. That's why it provides its own specialized sort algorithm.

How is std::sort implemented?

The std::sort is a sorting function that uses the Introsort algorithm and have the complexity of O(N log(N)) where N= std::distance(first, last) since C++11 and the order of equal elements is not guaranteed to be preserved[3]. The gcc-libstdc++ also uses Introsort algorithm.


2 Answers

// C++03 solution won't work with arrays and some other custom containers.
// Mostly drop this block:
#include <type_traits>
#include <vector>
#include <algorithm>
#include <iostream>
namespace aux {
  using std::begin; using std::end;
  template<typename C> auto adl_begin( C&& c )->decltype( begin(c) );
  template<typename C> auto adl_end( C&& c )->decltype( end(c) );

  template<typename C>
  struct container_traits:
    std::iterator_traits< typename std::decay< decltype( aux::adl_begin( *(C*)nullptr ) ) >::type >
  {
    typedef typename std::decay< decltype( adl_begin( *(C*)nullptr ) ) >::type iterator_type;
  };
}

// C++03 solution won't work with arrays.  Inside std::less, use Container::value_type:
template<
  typename Container,
  typename Comparison = std::less<
    typename aux::container_traits<Container>::value_type
  >
>
void indirect_sort_then_swap( Container& c, Comparison&& comp = Comparison() ) {
  typedef aux::container_traits<Container> con_traits;
  typedef typename con_traits::value_type value_type;
  typedef typename con_traits::iterator_type iterator_type;
  std::vector< iterator_type > indirect;
  {
    // C++03 solution can use c.begin(), but will not work with arrays:
    using std::begin; using std::end;
    auto begin_ = begin(c);
    auto end_ = end(c);
    for( auto it = begin_; it != end_; ++it ) {
      indirect.push_back( it );
    }
  }
  // In C++03, write a functor class that does this:
  auto indirect_sort = [&comp]( iterator_type const& left, iterator_type const& right )->bool {
    return comp(*left, *right);
  };
  std::sort( indirect.begin(), indirect.end(), indirect_sort );
  // at this point, indirect is a vector with the contents of c sorted by iterator:
  // a hard part remains, namely to take this information and sort c with minimal swaps
  // That is hard.  I will instead create an easy approach, namely create an empty
  // copy of c full of empty elements, and directly swap the correct entry of c into
  // each slot, then I swap c with its copy.
  // the downside is that my container now needs to support push_back.  Oh well.
  Container c2;
  // C++03 solution cannot use auto here.  But we know the type of indirect:
  for (auto it = indirect.begin(); it != indirect.end(); ++it) {
    // See previous comment
    auto itv = *it;
    c2.push_back( value_type() );
    using std::swap;
    swap( *itv, c2.back() );
  }
  // by this point, the contents of c have been swap-moved to c2
  // swap them back:
  {
    using std::swap;
    swap( c, c2 );
  }
}

int main() {
   std::vector<int> foo;
   foo.push_back(7);
   foo.push_back(3);
   indirect_sort_then_swap(foo);
   for (auto i:foo) {
      std::cout << i << "\n";
   }
}

something like the above is a viable approach. I wrote a bunch of it in C++11, but included comments on how to strip out the extra C++11 stuff (it actually simplifies the code in some cases, but removes the ability to handle some container-like things).

The basic idea is to sort a vector of iterators into your original container. Then we create a temporary container, stuff trivial value_types into it, swap those trivial value_types with the correct data from the original container (as determined by the vector of sorted iterators), then swap this temporary container for our original container.

There is lots of allocation, but hopefully of cheap stuff.

For this to work, the data you are sorting needs be trivial constructable. For this to be efficient, the data you are working with when trivially constructed needs to be cheap, and swap needs to be efficient.

I attempted to make this as ADL friendly as I can, because I find that to be good practice.

like image 109
Yakk - Adam Nevraumont Avatar answered Sep 29 '22 00:09

Yakk - Adam Nevraumont


Heap-sort is a swap-only sort which is not stable (the order of equivalent elements may change during the sorting). I answered an other similar question where I implemented the heap-sort myself (PasteBin), but you may find better and more flexible implementations out there.

The conclusion was that g++'s std::sort used 35 copy, 19 assignment, 10 swap and 35 delete (99 operations altogether) for 20 elements, my heap sort used 62 swaps and nothing else.

I just bumped into a stable sort which uses only swap here on stackoverflow. I did not look into it deeper.

like image 22
Notinlist Avatar answered Sep 29 '22 01:09

Notinlist