
What are the breaking changes caused by rewritten comparison operators?

There are some new rules about rewritten comparison operators in C++20, and I'm trying to understand how they work. I've run into the following program:

struct B {};

struct A
{
    bool operator==(B const&);  // #1
};

bool operator==(B const&, A const&);  // #2

int main()
{
  B{} == A{};  // C++17: calls #2
               // C++20: calls #1
}

which actually breaks existing code. I'm a little surprised by this; #2 actually still looks better to me :p

So how do these new rules change the meaning of existing code?

asked Sep 30 '20 03:09 by cigien



2 Answers

That particular aspect is a simple form of rewriting: reversing the operands. The primary operators == and <=> can be reversed, while the secondary operators !=, <, >, <=, and >= can be rewritten in terms of the primaries.

The reversing aspect can be illustrated with a relatively simple example.

If you don't have a specific B::operator==(A) to handle b == a, the compiler can use the reverse to do it instead: A::operator==(B). This makes sense because equality is a bi-directional relationship: (a == b) => (b == a).

Rewriting for secondary operators, on the other hand, involves using different operators. Consider a > b. If the compiler cannot locate a function to do that directly, such as A::operator>(B), it will go looking for things like A::operator<=>(B) and then simply calculate the result from that.

That's a simplistic view of the process but it's one that most of my students seem to understand. If you want more details, it's covered in the [over.match.oper] section of C++20, part of overload resolution (@ is a placeholder for the operator):

For the relational and equality operators, the rewritten candidates include all member, non-member, and built-in candidates for the operator <=> for which the rewritten expression (x <=> y) @ 0 is well-formed using that operator<=>.

For the relational, equality, and three-way comparison operators, the rewritten candidates also include a synthesized candidate, with the order of the two parameters reversed, for each member, non-member, and built-in candidate for the operator <=> for which the rewritten expression 0 @ (y <=> x) is well-formed using that operator<=>.


Hence gone are the days of having to provide a real operator== and operator<, then boiler-plating:

operator!=      as      !  operator==
operator>       as      ! (operator== || operator<)
operator<=      as         operator== || operator<
operator>=      as      !  operator<

Don't complain if I've gotten one or more of those wrong; that just illustrates my point on how much better C++20 is, since you now only have to provide a minimal set (most likely just operator<=> plus whatever else you want for efficiency) and let the compiler look after the rest :-)


Why one is being selected over the other can be seen with this code:

#include <iostream>

struct B {};
struct A {
    bool operator==(B const&) { std::cout << "1\n"; return true; }
};
bool operator==(B const&, A const&) { std::cout << "2\n"; return true; }

int main() {
  auto b = B{}; auto a = A{};

           b ==          a;  // outputs: 1
  (const B)b ==          a;  //          1
           b == (const A)a;  //          2
  (const B)b == (const A)a;  //          2
}

The output indicates that it's the const-ness of a that decides which is the better candidate.

As an aside, you may want to have a look at this article, which offers a more in-depth look.

answered Oct 18 '22 20:10 by paxdiablo


From a non-language-lawyer sense, it works like this. C++20 requires that operator== compute whether the two objects are equal. The concept of equality is commutative: if A == B, then B == A. As such, if there are two operator== functions that could be called by C++20's argument reversal rules, then your code should behave identically either way.

Basically, what C++20 is saying is that if it matters which one gets called, you're defining "equality" incorrectly.


So let's get into the details. And by "the details", I mean the most horrifying chapter of the standard: function overload resolution.

[over.match.oper]/3 defines the mechanism by which the candidate function set for an operator overload is built. C++20 adds to this by introducing "rewritten candidates": a set of candidate functions discovered by rewriting the expression in a way that C++20 deems to be logically equivalent. This only applies to the relational, in/equality, and three-way comparison operators.

The set is built in accord with the following:

  • For the relational ([expr.rel]) operators, the rewritten candidates include all non-rewritten candidates for the expression x <=> y.
  • For the relational ([expr.rel]) and three-way comparison ([expr.spaceship]) operators, the rewritten candidates also include a synthesized candidate, with the order of the two parameters reversed, for each non-rewritten candidate for the expression y <=> x.
  • For the != operator ([expr.eq]), the rewritten candidates include all non-rewritten candidates for the expression x == y.
  • For the equality operators, the rewritten candidates also include a synthesized candidate, with the order of the two parameters reversed, for each non-rewritten candidate for the expression y == x.
  • For all other operators, the rewritten candidate set is empty.

Note the particular concept of a "synthesized candidate". This is standard-speak for "reversing the arguments".

The rest of the section details what it means if one of the rewritten candidates gets chosen (aka: how to synthesize the call). To find which candidate gets chosen, we must delve into the most horrifying part of the most horrifying chapter of the C++ standard:

Best viable function matching.

What matters here is this statement:

a viable function F1 is defined to be a better function than another viable function F2 if for all arguments i, ICSi(F1) is not a worse conversion sequence than ICSi(F2), and then

And that matters... because of this. Literally.

By the rules of [over.ics.rank], when two reference bindings differ only in cv-qualification, binding to the less-qualified type is a better match than one that adds a qualifier.

A{} is a prvalue, and... it's not const. Neither is the this parameter to the member function. So it binds without adding const, which is a better conversion sequence than one that goes to the const A& of the non-member function.

Yes, there is a rule further down that explicitly makes rewritten functions in the candidate list less preferred. But it doesn't matter, because the rewritten call is a better match on function arguments alone.

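If you use explicit variables and declare one like this A const a{};, then the non-const member function is no longer viable for it, so the non-member function is chosen. As seen here. Similarly, if you make the member function const, both candidates match equally well on arguments, so [over.match.best]/2.8 gets involved and de-prioritizes the rewritten version, and you get consistent behavior.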

answered Oct 18 '22 21:10 by Nicol Bolas