Why does std::nth_element return sorted vectors for input vectors with N < 33 elements?

I am using std::nth_element to get a (roughly correct) value for a percentile of a vector, like so:

double percentile(std::vector<double> &vectorIn, double percent)
{
    std::nth_element(vectorIn.begin(), vectorIn.begin() + (percent*vectorIn.size())/100, vectorIn.end());

    return vectorIn[(percent*vectorIn.size())/100];
}  

I noticed that for vectorIn lengths of up to 32 elements, the vector gets completely sorted. Starting from 33 elements it is never sorted (as expected).

Not sure whether this matters, but the function is part of a (MATLAB) MEX C++ file that is compiled from MATLAB using the "Microsoft Windows SDK 7.1 (C++)" compiler.
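For reference, here is a minimal standalone test (outside the MEX setup, assuming an ordinary C++11 compiler) that reproduces the observation by running std::nth_element on random vectors of increasing length and checking whether the whole vector ends up sorted; the exact 32/33 cutoff may of course differ between standard library implementations:

#include <algorithm>
#include <iostream>
#include <random>
#include <vector>

int main()
{
    std::mt19937 gen(42);
    std::uniform_real_distribution<double> dist(0.0, 1.0);

    for (std::size_t n = 2; n <= 40; ++n)
    {
        std::vector<double> v(n);
        for (double &x : v) x = dist(gen);

        // ask for the median element, similar to what percentile() above does
        std::nth_element(v.begin(), v.begin() + n / 2, v.end());

        std::cout << "n = " << n << ": fully sorted = " << std::boolalpha
                  << std::is_sorted(v.begin(), v.end()) << '\n';
    }
}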

EDIT:

Also see the following histogram of the lengths of the longest sorted blocks in 1e5 vectors passed to the function (vectors contained 1e4 random elements and a random percentile was calculated). Note the peak at very small values.

[Histogram of the lengths of the longest sorted blocks]

asked Feb 16 '15 by stack_horst

1 Answer

This will vary from one standard library implementation to another (and may vary based on other factors), but in general terms:

  • std::nth_element is allowed to rearrange the input range however it sees fit, provided that the element that ends up at position n is the one a full sort would put there, and the range is partitioned around position n (see the verification sketch after this list).

  • For small ranges, a full insertion sort is usually faster than a quickselect, even though insertion sort does not scale.
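
To make the first point concrete, here is a small sketch (with an arbitrary input size and pivot position chosen only for illustration) that checks the two guarantees nth_element actually gives, rather than checking for a full sort:

#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

int main()
{
    std::mt19937 gen(123);
    std::uniform_real_distribution<double> dist(0.0, 1.0);

    std::vector<double> v(1000);
    for (double &x : v) x = dist(gen);

    auto nth = v.begin() + 250;                 // arbitrary example position
    std::nth_element(v.begin(), nth, v.end());

    // guarantee 1: *nth is the element a full sort would place at index 250
    std::vector<double> sorted = v;
    std::sort(sorted.begin(), sorted.end());
    assert(*nth == sorted[250]);

    // guarantee 2: the range is partitioned around position 250 --
    // nothing before nth is greater than *nth, nothing after it is smaller
    assert(std::all_of(v.begin(), nth, [&](double x) { return x <= *nth; }));
    assert(std::all_of(nth + 1, v.end(), [&](double x) { return x >= *nth; }));
}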

Since standard library authors will usually opt for the fastest solution, most nth_element implementations (and, for that matter, sort implementations) use customized algorithms for small inputs (or for small segments at the bottom of the recursion), which may sort the container more aggressively than strictly necessary. For vectors of scalar values, insertion sort is extremely fast, since it takes maximum advantage of the cache; with SIMD streaming extensions it can be sped up even more by doing parallel compares.
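As a rough illustration of that idea (a schematic sketch only, not the code of any particular standard library, and with the cutoff of 32 being purely hypothetical), a selection routine with a small-range fallback might look like this:

#include <algorithm>
#include <iterator>
#include <utility>

// Schematic only: small ranges are insertion-sorted outright, which is cheap,
// cache-friendly, and trivially satisfies the nth_element postconditions
// (the range simply ends up fully sorted). Larger ranges fall back to the
// real algorithm (quickselect/introselect); here we just delegate.
template <typename Iter>
void nth_element_sketch(Iter first, Iter nth, Iter last)
{
    const std::ptrdiff_t small_cutoff = 32;     // hypothetical threshold

    if (std::distance(first, last) <= small_cutoff)
    {
        if (first == last)
            return;
        // plain insertion sort of the whole small range
        for (Iter i = std::next(first); i != last; ++i)
        {
            auto value = std::move(*i);
            Iter j = i;
            while (j != first && value < *std::prev(j))
            {
                *j = std::move(*std::prev(j));
                --j;
            }
            *j = std::move(value);
        }
        return;
    }

    std::nth_element(first, nth, last);         // large ranges: real selection
}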

By the way, you can save a tiny amount of calculation by only computing the threshold iterator once, which might be more readable:

double percentile(std::vector<double> &vectorIn, double percent)
{
    auto nth = vectorIn.begin() + (percent*vectorIn.size())/100;
    std::nth_element(vectorIn.begin(), nth, vectorIn.end());
    return *nth;
}
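
For example (the 90 is just an arbitrary percentile, and note that the function still reorders its input vector):

#include <iostream>
#include <vector>

// assumes the percentile() function defined above

int main()
{
    std::vector<double> data = {5.0, 1.0, 9.0, 3.0, 7.0, 2.0, 8.0, 4.0, 6.0, 0.0};
    std::cout << "90th percentile (approx): " << percentile(data, 90) << '\n';
    // prints 9 for this input (the 9th-smallest, 0-based, of the values 0..9)
}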
answered by rici