Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can I legally use a struct with overloaded operator() as Compare for std::upper_bound?

I have structs like this (types simplified to carry over the point), living in a std::vector:

struct Region {
    int first;
    int count;
    struct Metadata region_metadata;
};

In the vector, they are ordered by first. If you add first and count, you get the first of the next region; so basically this vector of structs describes metadata for contiguous ranges of numbers.

Now given an integer, I want to look up the metadata. As the regions are sorted, I can use std::upper_bound. I implemented it this way:

struct Comp
{
    inline bool operator()(const Region &region, int index) const
    {
        return region.first < index;
    }

    inline bool operator()(int index, const Region &region) const
    {
        return index < region.first;
    }
};

This works, when calling std::upper_bound with:

auto iter = std::upper_bound(m_regions.begin(),
                             m_regions.end(),
                             index,
                             Comp());

Now this happens to work, because upper_bound can internally pick the overload which matches its requirements, as it calls both Comp()(Region, int) and Comp()(int, Region) (which is the reason why a [](const Region &reg, int index){…} would not work).

I actually came up with the solution by tracing down the error messages when using the lambda I mentioned before. The docs for std::upper_bound at cppreference.com write about the fourth argument:

comparison function object (i.e. an object that satisfies the requirements of Compare) which returns ​true if the first argument is less than the second.

The signature of the comparison function should be equivalent to the following:

bool cmp(const Type1 &a, const Type2 &b);

The signature does not need to have const &, but the function object must not modify the objects passed to it. The types Type1 and Type2 must be such that an object of type T can be implicitly converted to both Type1 and Type2, and an object of type ForwardIt can be dereferenced and then implicitly converted to both Type1 and Type2.

The type Type1 must be such that an object of type T can be implicitly converted to Type1. The type Type2 must be such that an object of type ForwardIt can be dereferenced and then implicitly converted to Type2. ​

(cppreference has been fixed since I posted this question, thanks @T.C.)

Here, T is the third argument to std::upper_bound and ForwardIt is the type of the first two arguments. This quote does not talk about the function object actually being a struct which overloads its operator() to cover the "forward" and "reverse" situations.

So in Rules-As-Written, is this legal, or is it an artifact of my specific compiler/standard library combination (g++ 5.3.1)?

I am interested in answers specific to C++14 or C++17.

Full example:

#include <algorithm>
#include <iostream>
#include <vector>


struct Region {
    Region(int first, int count, int n):
        first(first),
        count(count),
        n(n)
    {

    }

    int first;
    int count;
    int n; // think struct Metadata
};


struct Comp
{
    inline bool operator()(const Region &region, int index) const
    {
        return region.first < index;
    }

    inline bool operator()(int index, const Region &region) const
    {
        return index < region.first;
    }
};


int main() {
    std::vector<Region> regions;
    regions.emplace_back(0, 10, 1);
    regions.emplace_back(10, 10, 2);
    regions.emplace_back(20, 10, 3);

    const int lookup = 10;

    auto iter = std::upper_bound(
        regions.begin(),
        regions.end(),
        lookup,
        Comp());

    // yes, I omitted error checking here, with error being iter == regions.begin()
    std::cout << lookup << " is in region with n = " << (iter-1)->n << std::endl;
}
like image 698
Jonas Schäfer Avatar asked Mar 15 '16 09:03

Jonas Schäfer


1 Answers

Now this happens to work, because upper_bound can internally pick the overload which matches its requirements, as it calls both Comp()(Region, int) and Comp()(int, Region) (which is the reason why a [](const Region &reg, int index){…} would not work).

No, upper_bound only calls Comp's second overload. That's precisely what your (fixed - thanks @T.C.!) quote is about: The first argument to the comparator is always the third argument to upper_bound. The lambda's parameters should be swapped.

Overloading operator() inside a comparator for upper_bound/lower_bound is inherently meaningless, because only one overload will ever be picked by those algorithms.

operator() should be overloaded as you've shown when working with equal_range, and it is legal to do so since the internal details of a comparator (or any functor for that matter) are irrelevant to the library: You solely need to make sure that the ordering is strict (i.e. correct semantics) and the overloads unambiguous.

like image 197
Columbo Avatar answered Nov 01 '22 04:11

Columbo