Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Comparing polymorphic types in c++20

I have code that is somewhere between c++17 and c++20. Specifically, we have c++20 enabled on GCC-9 and clang-9, where it is only partially implemented.

In code we have quite big hierarchy of polymorphic types like this:

struct Identifier {
    virtual bool operator==(const Identifier&other) const = 0;
};

struct UserIdentifier : public Identifier {
    int userId =0;
    bool operator==(const Identifier&other) const override {
        const UserIdentifier *otherUser = dynamic_cast<const UserIdentifier*>(&other);
        return otherUser && otherUser->userId == userId;
    }
};

struct MachineIdentifier : public Identifier {
    int machineId =0;
    bool operator==(const Identifier&other) const override {
        const MachineIdentifier *otherMachine = dynamic_cast<const MachineIdentifier*>(&other);
        return otherMachine && otherMachine->machineId == machineId;
    }
};

int main() {
    UserIdentifier user;
    MachineIdentifier machine;
    return user==machine? 1: 0;
}

https://godbolt.org/z/er4fsK

We are now migrating to GCC-10 and clang-10, but because of reasons we still need to work on versions 9 (well, at least clang-9 as this is what android NDK currently has).

The above code stops compiling because new rules about comparison operators are implemented. Reversible operator== causes ambiguities. I can't use a spaceship operator because it is not implemented in versions 9. But I omitted this from the example - I assume that whatever works with == will work with other operators.

So: What is the recommended approach to implementing comparison operators in c++20 with polymorphic types?

like image 831
MateuszL Avatar asked Oct 26 '20 21:10

MateuszL


3 Answers

As an intermediate solution you could re-factor your polymorphic equality operator== to a non-virtual operator== defined in the base class, which polymorphically dispatches to a non-operator virtual member function:

struct Identifier {    
    bool operator==(const Identifier& other) const {
        return isEqual(other);
    }
private:
    virtual bool isEqual(const Identifier& other) const = 0;
};

// Note: do not derive this class further (less dyncasts may logically fail).
struct UserIdentifier final : public Identifier {
    int userId = 0;
private:
    virtual bool isEqual(const Identifier& other) const override {
        const UserIdentifier *otherUser = dynamic_cast<const UserIdentifier*>(&other);
        return otherUser && otherUser->userId == userId;
    }
};

// Note: do not derive this class further (less dyncasts may logically fail).
struct MachineIdentifier final : public Identifier {
    int machineId = 0;
private:
    virtual bool isEqual(const Identifier& other) const override {
        const MachineIdentifier *otherMachine = dynamic_cast<const MachineIdentifier*>(&other);
        return otherMachine && otherMachine->machineId == machineId;
    }
};

There will now no longer be an ambiguity as dispatch on the isEqual virtual member function will always be done on the left hand side argument to operator==.

const bool result = (user == machine);  // user.isEqual(machine);
like image 185
dfrib Avatar answered Oct 08 '22 23:10

dfrib


OK, I see it wasn't mentioned in the answer given by @dfrib, so I'll extend that answer to show it.

You could add an abstract (pure virtual) function in the Identifier structure, which returns its "identity".

Then, in each structure which extends the Identifier structure, you could call that function instead of dynamically casting the input object and check that its type is matching the this object.

Of course, you will have to ensure complete distinguish between the set of identities of each structure. In other words, any two sets of identities must not share any common values (i.e., the two sets must be disjoint).

This will allow you to completely get rid of RTTI, which is pretty much the complete opposite of polymorphism IMO, and also yields an additional runtime impact on top of that.

Here is the extension of that answer:

struct Identifier {    
    bool operator==(const Identifier& other) const {
        return getVal() == other.getVal();
    }
private:
    virtual int getVal() const = 0;
};

struct UserIdentifier : public Identifier {
private:
    int userId = 0;
    virtual int getVal() const override {
        return userId;
    }
};

struct MachineIdentifier : public Identifier {
private:
    int machineId = 100;
    virtual int getVal() const override {
        return machineId;
    }
};

If you want to support a structure with identifiers other some type other than int, then you can extend this solution to use templates.

Alternatively to enforcing a different set of identities for each structure, you could add a type field, and ensure that only this field is unique across different structures.

In essence, those types would be the equivalent of the dynamic_cast check, which compares between the pointer of the input object's V-table and the pointer of the input structure's V-table (hence my opinion of this approach being the complete opposite of polymorphism).

Here is the revised answer:

struct Identifier {    
    bool operator==(const Identifier& other) const {
        return getType() == other.getType() && getVal() == other.getVal();
    }
private:
    virtual int getType() const = 0;
    virtual int getVal() const = 0;
};

struct UserIdentifier : public Identifier {
private:
    int userId = 0;
    virtual int getType() const override {
        return 1;
    virtual int getVal() const override {
        return userId;
    }
};

struct MachineIdentifier : public Identifier {
private:
    int machineId = 0;
    virtual int getType() const override {
        return 2;
    virtual int getVal() const override {
        return machineId;
    }
};
like image 29
goodvibration Avatar answered Oct 08 '22 21:10

goodvibration


This does not look like a problem of polymorphism. Actually, I think that there is any polymorphism at all is a symptom of a data model error.

If you have values that identify machines, and values that identify users, and these identifiers are not interchangeable¹, they should not share a supertype. The property of "being an identifier" is a fact about how the type is used in the data model to identify values of another type. A MachineIdentifier is an identifier because it identifies a machine; a UserIdentifier is an identifier because it identifies a user. But an Identifier is in fact not an identifier, because it doesn't identify anything! It is a broken abstraction.

A more intuitive way to put this might be: the type is the only thing that makes an identifier meaningful. You cannot do anything with a bare Identifier, unless you first downcast it to MachineIdentifier or UserIdentifier. So having an Identifier class is most likely wrong, and comparing a MachineIdentifier to a UserIdentifier is a type error that should be detected by the compiler.

It seems to me the most likely reason Identifier exists is because someone realized that there was common code between MachineIdentifier and UserIdentifier, and leapt to the conclusion that the common behavior should be extracted to an Identifier base type, with the specific types inheriting from it. This is an understandable mistake for anyone who has learned in school that "inheritance enables code reuse" and has not yet realized that there are other kinds of code reuse.

What should they have written instead? How about a template? Template instantiations are not subtypes of the template or of each other. If you have types Machine and User that these identifiers represent, you might try writing a template Identifier struct and specializing it, instead of subclassing it:

template <typename T>
struct Identifier {};

template <>
struct Identifier<User> {
  int userId = 0;
  bool operator==(const Identifier<User> &other) const {
    return other.userId == userId;
  }
};

template <>
struct Identifier<Machine> {
  int machineId = 0;
  bool operator==(const Identifier<Machine> &other) const {
    return other.machineId == machineId;
  }
};

This probably makes the most sense when you can move all the data and behavior into the template and thus not need to specialize. Otherwise, this is not necessarily the best option because you cannot specify that Identifier instantiations must implement operator==. I think there may be a way to achieve that, or something similar, using C++20 concepts, but instead, let's combine templates with inheritance to get some advantages of both:

template <typename Id>
struct Identifier {
  virtual bool operator==(const Id &other) const = 0;
};

struct UserIdentifier : public Identifier<UserIdentifier> {
  int userId = 0;
  bool operator==(const UserIdentifier &other) const override {
    return other.userId == userId;
  }
};

struct MachineIdentifier : public Identifier<MachineIdentifier> {
  int machineId = 0;
  bool operator==(const MachineIdentifier &other) const override {
    return other.machineId == machineId;
  }
};

Now, comparing a MachineIdentifier to a UserIdentifier is a compile time error.

This technique is called the curiously recurring template pattern (also see crtp). It is somewhat baffling when you first come across it, but what it gives you is the ability to refer to the specific subclass type in the superclass (in this example, as Id). It might also be a good option for you because, compared to most other options, it requires relatively few changes to code that already correctly uses MachineIdentifier and UserIdentifier.


¹ If the identifiers are interchangeable, then most of this answer (and most of the other answers) probably does not apply. But if that is the case, it should also be possible to compare them without downcasting.

like image 1
trent Avatar answered Oct 08 '22 23:10

trent