Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Reason for existence of sort_heap

Tags:

c++

sorting

stl

While browsing the less known parts of the standard library, I stumbled onto std::sort_heap. But I don't understand why does it exist since there is a free function named std::sort.

Also please note that complexities are the same.

So, my question is: what is the rationale for the existence of sort_heap?

like image 238
NoSenseEtAl Avatar asked Oct 08 '12 17:10

NoSenseEtAl


3 Answers

Code size is a good reason to use the heap sort. These are template functions; for each combination of type being sorted and compare function, you get a full-blown sort implementation from std::sort (i.e. no part of the code for sorting one case is shared with the code for sorting a different case - even if it's the same type, but different compare).

The same is true for the heap sort (i.e. std::make_heap followed by std::sort_heap) - but the amount of code generated can be considerably less, especially if the compare operator isn't completely trivial; I just did some tests, I was seeing 2k-3k bytes for a std::sort and 600-1000 bytes for heap sort of the same operation, on x86.

So if you tend to use a lot of sort operations on different types, and/or with different compare functions, it can be a good idea to use the heap sort for those which tend to operate on smaller N; for these the difference in algorithm efficiency won't hurt it much, and overall code bloat will be reduced.

I suspect that the heap implementation will tend to do more 'swaps' on a given problem, compared to std::sort, so if you are sorting a type which is more expensive to swap, it could be noticeably slower - for those cases maybe it's possible to sort an array of pointers instead.

like image 62
greggo Avatar answered Oct 12 '22 02:10

greggo


sort_heap assumes the input to be already in the form of a heap. This means it can theoretically work more efficiently than std::sort, since there are some constraints on the order of the input (unlike the std::sort, which has to work for all inputs).

As mentioned in the comments it is worth noting that those performance benefits are in no way ensured and obviously depend on the input data, so if performance matters there is really no way around profiling.

like image 45
Grizzly Avatar answered Oct 12 '22 02:10

Grizzly


The complexity guarantees are not actually the same.

std::sort requires O(log N) amount of memory on the stack. std::sort_heap requires O(1) amount of stack. This makes a big difference in an environment where stack space is constrained, for example in embedded applications (i.e., running on a microcontroller). Calling std::sort even on a few thousand element array can cause a stack overflow.

Incidentally, in embedded environments the internal storage is typically SRAM so you don't have to worry about cache locality where quicksort/introsort gain their performance advantage.

Therefore in a microcontroller environment it is advisable to write

std::make_heap(data.begin(), data.end());
std::sort_heap(data.begin(), data.end());

instead of

std::sort(data.begin(), data.end());
like image 36
user2501488 Avatar answered Oct 12 '22 01:10

user2501488