
What's the practical difference between std::nth_element and std::sort?

Tags:

c++

I've been looking at the std::nth_element algorithm which apparently:

Rearranges the elements in the range [first,last), in such a way that the element at the resulting nth position is the element that would be in that position in a sorted sequence, with none of the elements preceding it being greater and none of the elements following it smaller than it. Neither the elements preceding it nor the elements following it are guaranteed to be ordered.

However, with my compiler, running the following:

    vector<int> myvector;
    srand(GetTickCount());

    // set some values:
    for ( int i = 0; i < 10; i++ )
        myvector.push_back(rand());

    // nth_element around the 4th element
    nth_element (myvector.begin(), myvector.begin()+4, myvector.end());

    // print results
    for (auto it=myvector.begin(); it!=myvector.end(); ++it)
        cout << " " << *it;

    cout << endl;

Always returns a completely sorted list of integers in exactly the same way as std::sort does. Am I missing something? What is this algorithm useful for?

EDIT: Ok the following example using a much larger set shows that there is quite a difference:

    vector<int> myvector;
    srand(GetTickCount());

    // set some values:
    for ( int i = 0; i < RAND_MAX; i++ )
        myvector.push_back(rand());

    // nth_element around a random position
    nth_element (myvector.begin(), myvector.begin()+rand(), myvector.end());

    // compare the nth_element result against a fully sorted version
    vector<int> copy = myvector;
    std::sort(myvector.begin(), myvector.end());

    cout << (myvector == copy ? "true" : "false") << endl;
Benj asked Apr 27 '12 14:04





2 Answers

It's perfectly valid for std::nth_element to sort the entire range in order to fulfill the documented semantics. However, doing so would fail to meet the required complexity (linear on average). The key point is that it may sort everything, but it doesn't have to.

This means that std::nth_element can bail out early - as soon as it can tell what the n'th element of your range is going to be, it can stop. For instance, for a range

[9,3,6,2,1,7,8,5,4,0] 

asking it to give you the fourth element may yield something like

[2,0,1,3,8,5,6,9,7,4] 

The list was partially sorted, just good enough to be able to tell that the fourth element in order will be 3.
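
The guarantee described above can be checked with a small sketch. The helper names `nth_smallest` and `partitioned_around` are made up for illustration; only `std::nth_element` and `std::all_of` come from the standard library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the element that would sit at index n
// if v were sorted. Takes v by value so the caller's data is untouched.
int nth_smallest(std::vector<int> v, std::size_t n) {
    assert(n < v.size());
    std::nth_element(v.begin(), v.begin() + n, v.end());
    return v[n];
}

// Hypothetical helper: checks the only ordering guarantee nth_element gives,
// i.e. nothing before index n is greater than v[n] and nothing after it is smaller.
bool partitioned_around(const std::vector<int>& v, std::size_t n) {
    assert(n < v.size());
    return std::all_of(v.begin(), v.begin() + n, [&](int x) { return x <= v[n]; })
        && std::all_of(v.begin() + n + 1, v.end(), [&](int x) { return x >= v[n]; });
}
```

With the range from this answer, `nth_smallest({9,3,6,2,1,7,8,5,4,0}, 3)` gives 3, and `partitioned_around({2,0,1,3,8,5,6,9,7,4}, 3)` holds even though neither side of index 3 is sorted. The exact arrangement nth_element leaves behind depends on the implementation.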

Hence, if you want to answer 'which number is the fourth-smallest' or 'which are the four smallest numbers', then std::nth_element is your friend.

If you want to get the four smallest numbers in order you may want to consider using std::partial_sort.
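
A minimal sketch of that std::partial_sort use, assuming a made-up helper name `smallest_k_sorted`:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the k smallest elements of v in ascending order.
// v is taken by value, so the caller's container is left as it was.
std::vector<int> smallest_k_sorted(std::vector<int> v, std::size_t k) {
    assert(k <= v.size());
    // After this call [begin, begin + k) holds the k smallest elements, sorted;
    // the order of the remaining elements is unspecified.
    std::partial_sort(v.begin(), v.begin() + k, v.end());
    v.resize(k);  // drop the unordered tail
    return v;
}
```

For the range above, `smallest_k_sorted({9,3,6,2,1,7,8,5,4,0}, 4)` yields 0, 1, 2, 3.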

Frerich Raabe answered Oct 13 '22 23:10



In at least one standard library implementation (the identifiers suggest the one shipped with Visual C++), std::nth_element looks as follows:

    void _Nth_element(_RanIt _First, _RanIt _Nth, _RanIt _Last, _Pr _Pred)
    {
        for (; _ISORT_MAX < _Last - _First; )
        {   // divide and conquer, ordering partition containing Nth
            pair<_RanIt, _RanIt> _Mid =
                _Unguarded_partition(_First, _Last, _Pred);

            if (_Mid.second <= _Nth)
                _First = _Mid.second;
            else if (_Mid.first <= _Nth)
                return; // Nth inside fat pivot, done
            else
                _Last = _Mid.first;
        }

        _Insertion_sort(_First, _Last, _Pred);  // sort any remainder
    }

where _ISORT_MAX is defined as 32.

So if your sequence has 32 elements or fewer, the loop is skipped and the whole range is simply insertion-sorted. That is why your short sequence comes out fully sorted.
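
The same idea can be sketched in portable code. This is a simplified illustration, not the library's implementation: `select_nth` and `kSmallRange` are made-up names, the pivot is just the middle element, two `std::partition` calls stand in for `_Unguarded_partition`, and `std::sort` replaces the insertion sort.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed cutoff for this sketch, mirroring the value quoted above.
constexpr std::ptrdiff_t kSmallRange = 32;

// Simplified selection: partition around a pivot, keep only the side that
// contains index n, and fully sort the range once it is small enough.
void select_nth(std::vector<int>& v, std::size_t n) {
    assert(n < v.size());
    auto first = v.begin();
    auto last = v.end();
    const auto nth = v.begin() + n;

    while (last - first > kSmallRange) {
        const int pivot = *(first + (last - first) / 2);
        // Three-way split: [first, lo) < pivot, [lo, hi) == pivot, [hi, last) > pivot.
        auto lo = std::partition(first, last, [pivot](int x) { return x < pivot; });
        auto hi = std::partition(lo, last, [pivot](int x) { return x == pivot; });

        if (nth < lo) {
            last = lo;      // nth lies among the smaller elements
        } else if (nth < hi) {
            return;         // nth lies inside the run equal to the pivot, done
        } else {
            first = hi;     // nth lies among the larger elements
        }
    }
    std::sort(first, last); // small remainder: just sort all of it
}
```

Calling `select_nth` on a 10-element vector never enters the loop, so the whole vector ends up sorted, which matches what the question observed. That behaviour is a side effect of the cutoff, not something the documented semantics promise.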

epicfail answered Oct 13 '22 23:10
