 

Is there any efficiency advantage in using minmax_element over min_element and max_element together?

Tags: c++, max, min

std::minmax_element: returns a pair consisting of an iterator to the smallest element as the first element and an iterator to the greatest element as the second.

std::min_element: returns an iterator to the smallest element in the range [first, last).

std::max_element: returns an iterator to the largest element in the range [first, last).
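
A minimal illustration of the three calls (the sample data here is mine, purely for illustration):

#include <algorithm>
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v{3, 1, 4, 1, 5, 9, 2, 6};

    auto mn = std::min_element(v.begin(), v.end());   // iterator to 1
    auto mx = std::max_element(v.begin(), v.end());   // iterator to 9
    auto mm = std::minmax_element(v.begin(), v.end()); // pair of both iterators

    std::cout << *mn << ' ' << *mx << ' '
              << *mm.first << ' ' << *mm.second << '\n'; // 1 9 1 9
}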


Does std::minmax_element sort the complete list to achieve this?

Is the overhead of processing the pair returned from std::minmax_element worth it?

asked Oct 27 '16 by Saurav Sahu

4 Answers

You do not have to worry about std::minmax_element doing any sorting. It leaves the range exactly as it found it. The reason it is more efficient is that it can find both the max and the min in a single pass, whereas looking for the max and the min separately requires two full traversals.

std::minmax_element has a complexity of at most max(floor(3/2(N−1)), 0) comparisons, whereas std::max_element and std::min_element each make max(N−1, 0), so using std::minmax_element requires about 25% fewer operations.

There is also a subtle difference: std::minmax_element finds the last largest element, while std::max_element finds the first largest.

So if you need to find the min and max of a range then you should use std::minmax_element. If you only need the min or max then you should use the specialized version. Dealing with the return from std::minmax_element will get even easier with the upcoming C++17 standard and structured bindings. You will be able to write

auto [min, max] = std::minmax_element(...);

and now the first element of the pair is stored in min and the second is stored in max.
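
For example (a complete C++17 sketch; note that min and max here are iterators, so they must be dereferenced to get the values):

#include <algorithm>
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v{3, 1, 4, 1, 5, 9, 2, 6};

    // Structured bindings unpack the pair of iterators returned by minmax_element.
    auto [min, max] = std::minmax_element(v.begin(), v.end());

    std::cout << "min: " << *min << ", max: " << *max << '\n'; // min: 1, max: 9
}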

answered by NathanOliver


The other answers are good. I wanted to add a little about how minmax_element necessarily works, however, which also helps to explain why it (usually) works better than running min_element and max_element separately, and talk about some specific cases where it doesn't perform better.

If we think about a naive implementation, you would maintain a max-value and min-value (and their corresponding iterators) and simply iterate through the range, comparing each value with both the min-value and max-value and adjusting either as necessary. However, this would give you a total of 2N comparisons; while it might well perform better than iterating through the list twice (due to better use of locality), the specification calls for (roughly) 3/2 N comparisons. How is that possible then?

It works by processing pairs rather than individual items. Taking the first two items in the range (#0 and #1), we can compare them and assign the larger to max-value and the smaller to min-value. Then, we compare the next two items (#2 and #3) to decide which of them is larger; we compare the larger one to max-value and the smaller one to min-value, and update max-value/min-value as necessary. We then repeat this process with each additional pair (#4 and #5, then #6 and #7, and so on).

So, each pair requires three comparisons - with each other, then the highest with the current maximum, and the lowest with the current minimum. This cuts down the number of comparisons required to 3/2 N!
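
A minimal sketch of this pairwise scheme over a plain int array (my_minmax is a hypothetical illustration, not the library implementation, which must additionally handle arbitrary iterators, custom comparators, and the standard's tie-breaking rules):

#include <utility>

// Hypothetical illustration of the pairwise trick; assumes n >= 1.
// Element #0 seeds both extremes; the rest of the range is consumed in pairs.
std::pair<int, int> my_minmax(const int* data, int n)
{
    int lo = data[0], hi = data[0];
    int i = 1;
    for (; i + 1 < n; i += 2) {
        int a = data[i], b = data[i + 1];
        if (b < a) std::swap(a, b); // comparison 1: order the pair
        if (a < lo) lo = a;         // comparison 2: smaller element vs current min
        if (hi < b) hi = b;         // comparison 3: larger element vs current max
    }
    if (i < n) {                    // leftover element: check it against both
        if (data[i] < lo) lo = data[i];
        if (hi < data[i]) hi = data[i];
    }
    return {lo, hi};                // roughly 3 comparisons per 2 elements = 3/2 N
}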

As per the comments below, however, it should be noted that this "improved" algorithm tends to produce worse performance on modern processors than the naive version when using a type (or comparator) for which the comparison is cheap, notably with a range over a vector<int> or similar. The comparison between the two elements of each pair has an unpredictable result, leading to branch-prediction failures in the processor (though this only applies if the data is more or less randomly ordered); current compilers will not always convert such branches into conditional transfers, as they potentially could. The more complex algorithm is also harder for a compiler to vectorise.

In theory, I think, a C++ library implementation could provide an overload for the minmax_element function which used the naive algorithm for primitive (int etc) element types with the default comparator. While the standard mandates a bound to the number of comparisons, if the effect of those comparisons cannot be observed then the number actually computed is unimportant, so long as the time complexity is the same (which it is - O(N) in both cases). However, while this might give better performance with randomly ordered data, it might produce worse performance when the data is ordered.

With all the above in mind, a simple test case (below) shows an interesting result: for randomly ordered data, using min_element and max_element separately can in fact be slightly faster than using minmax_element. However, for sorted data, minmax_element is much faster than using min_element and max_element separately. On my system (Haswell processor), with the code below compiled with gcc -O3 -std=c++11 -march=native (GCC version 5.4), a sample run shows 692 milliseconds for min/max separately and 848 for minmax combined. There is of course some variation between runs, but these values seem typical.

Note that:

  • The difference in performance is small enough that it is unlikely to be a dominating factor in a real-world program;
  • The difference is dependent on compiler optimisation; in the future, the results may well reverse;
  • For more complex data types (or rather, for more complex comparators) the results will likely reverse, since in that case performing fewer comparisons is likely a significant improvement;
  • When the sample data is ordered instead of random (replace v.push_back(r(gen)) with v.push_back(i) in the program below), the performance is very different: for separate min/max it is around 728 milliseconds, whereas for combined minmax it goes down to 246 milliseconds.

The code:

#include <iostream>
#include <vector>
#include <algorithm>
#include <random>
#include <chrono>

constexpr int numEls = 100000000;

void recresult(std::vector<int> *v, int min, int max)
{
   // Make sure the compiler doesn't optimize out the values: 
   __asm__ volatile (
       ""
       :
       : "rm"(v), "rm"(min), "rm"(max)
   );
}

int main(int argc, char **argv)
{
    using namespace std;

    std::mt19937 gen(0);
    uniform_int_distribution<> r(0, 100000);


    vector<int> v;
    for (int i = 0; i < numEls; i++) {
        v.push_back(r(gen));
    }

    // run once for warmup
    int min = *min_element(v.begin(), v.end());
    int max = *max_element(v.begin(), v.end());
    recresult(&v, min, max);

    // min/max separately:
    {
        auto starttime = std::chrono::high_resolution_clock::now();
        for (int i = 0; i < 5; i++) {
            int min = *min_element(v.begin(), v.end());
            int max = *max_element(v.begin(), v.end());
            recresult(&v, min, max);
        }
        auto endtime = std::chrono::high_resolution_clock::now();
        auto millis = std::chrono::duration_cast<std::chrono::milliseconds>(endtime - starttime).count();

        cout << "min/max element: " << millis << " milliseconds." << endl;
    }

    // run once for warmup
    auto minmaxi = minmax_element(v.begin(), v.end());
    recresult(&v, *(minmaxi.first), *(minmaxi.second));

    // minmax together:
    {
        auto starttime = std::chrono::high_resolution_clock::now();
        for (int i = 0; i < 5; i++) {
            minmaxi = minmax_element(v.begin(), v.end());
            recresult(&v, *(minmaxi.first), *(minmaxi.second));
        }
        auto endtime = std::chrono::high_resolution_clock::now();
        auto millis = std::chrono::duration_cast<std::chrono::milliseconds>(endtime - starttime).count();

        cout << "minmax element: " << millis << " milliseconds." << endl;
    }

    return 0;
}

answered by davmac


Yes. You're iterating over the range only once instead of doing it twice.
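
A minimal sketch of what that single pass looks like (illustrative only; the real implementation uses the pairwise trick described in the other answers, but the point about one traversal versus two is the same):

#include <vector>

// One traversal tracking both extremes, instead of two separate scans.
void one_pass_minmax(const std::vector<int>& v, int& lo, int& hi)
{
    lo = hi = v.front(); // assumes v is non-empty
    for (int x : v) {
        if (x < lo) lo = x;
        if (x > hi) hi = x;
    }
}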

answered by krzaq


std::minmax_element complexity:

At most max(floor(3/2(N−1)), 0) applications of the predicate, where N = std::distance(first, last).

std::min_element complexity (same as max_element):

Exactly max(N-1,0) comparisons, where N = std::distance(first, last).

Ignoring max and floor, we get:

(N-1) * 2 vs 3/2 (N-1)

So by using minmax_element you need at most 3/4 of the comparisons that max_element + min_element would require; for example, with N = 1,000 that is at most 1,498 comparisons instead of 1,998.
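
You can verify these counts empirically by passing a counting comparator (a sketch; the exact minmax_element count may vary slightly by implementation, but must stay within the mandated bound):

#include <algorithm>
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v(1000);
    for (int i = 0; i < 1000; ++i)
        v[i] = (i * 37) % 1000; // arbitrary scrambled data

    long count = 0;
    auto counting_less = [&count](int a, int b) { ++count; return a < b; };

    std::minmax_element(v.begin(), v.end(), counting_less);
    std::cout << "minmax_element: " << count << " comparisons\n"; // at most 1498

    count = 0;
    std::min_element(v.begin(), v.end(), counting_less);
    std::max_element(v.begin(), v.end(), counting_less);
    std::cout << "min + max:      " << count << " comparisons\n"; // exactly 1998
}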

minmax_element exploits the transitivity of the < operator by comparing two elements of the range at once: if a < b, then a can only improve the minimum and b can only improve the maximum, so it only needs to check min(a, current_min) and max(b, current_max), and vice versa when b < a.

Also worth noting:

This algorithm is different from std::make_pair(std::min_element(), std::max_element()), not only in efficiency, but also in that this algorithm finds the last biggest element while std::max_element finds the first biggest element.

answered by sbabbi