Comparing instances of a dict subclass



I have subclassed dict to add an extra method (so no overriding).

Now, when I try to compare two instances of that subclass, I get something weird:

>>> d1.items() == d2.items()
True
>>> d1.values() == d2.values()
True
>>> d1.keys() == d2.keys()
True
>>> d1 == d2
False


That's damn weird... I don't understand it at all! Does anybody have insight into how dict.__eq__ is implemented?

Here is all the code:

# ------ Below is my dict subclass (with no overriding):

class ClassSetDict(dict):

    def subsetget(self, klass, default=None):
        class_sets = set(filter(lambda cs: klass <= cs, self))
        # Eliminate supersets
        for cs1 in class_sets.copy():
            for cs2 in class_sets.copy():
                if cs1 <= cs2 and cs1 is not cs2:
                    # cs2 is a superset of cs1, so drop it
                    class_sets.discard(cs2)
        try:
            best_match = list(class_sets)[0]
        except IndexError:
            return default
        return self[best_match]

# ------  Then an implementation of class sets

class ClassSet(object):
    # Set of classes, making it easy to compute inclusions
    # with comparison operators: `A < B` <=> "A strictly included in B"

    def __init__(self, klass):
        self.klass = klass

    def __ne__(self, other):
        return not self == other

    def __gt__(self, other):
        other = self._default_to_singleton(other)
        return not self == other and other < self

    def __le__(self, other):
        return self < other or self == other

    def __ge__(self, other):
        return self > other or self == other

    def _default_to_singleton(self, klass):
        if not isinstance(klass, ClassSet):
            return Singleton(klass)
        return klass

class Singleton(ClassSet):

    def __eq__(self, other):
        other = self._default_to_singleton(other)
        return self.klass == other.klass

    def __lt__(self, other):
        if isinstance(other, AllSubSetsOf):
            return issubclass(self.klass, other.klass)
        return False

class AllSubSetsOf(ClassSet):

    def __eq__(self, other):
        if isinstance(other, AllSubSetsOf):
            return self.klass == other.klass
        return False

    def __lt__(self, other):
        if isinstance(other, AllSubSetsOf):
            return issubclass(self.klass, other.klass) and not other == self
        return False

# ------ and finally the 2 dicts that don't want to be equal !!!

d1 = ClassSetDict({AllSubSetsOf(object): (int,)})
d2 = ClassSetDict({AllSubSetsOf(object): (int,)})

The problem you're seeing has nothing at all to do with subclassing dict. In fact, the same behavior can be seen with a regular dict. The problem is how you have defined the keys you're using. A simple class like:

>>> class Foo(object):
...     def __init__(self, value):
...         self.value = value
...     def __eq__(self, other):
...         return self.value == other.value

is enough to demonstrate the problem:

>>> f1 = Foo(5)
>>> f2 = Foo(5)
>>> f1 == f2
True
>>> d1 = {f1: 6}
>>> d2 = {f2: 6}
>>> d1.items() == d2.items()
True
>>> d1 == d2
False

What's missing is that you forgot to define __hash__. Every time you change the equality semantics of a class, you should make sure that the __hash__ method agrees with it: when two objects are equal, they must have equal hashes. dict behavior depends strongly on the hash value of keys.

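When you inherit from object, you automatically get both __eq__ and __hash__. The former compares object identity and the latter returns a value derived from the address of the object, so the two agree. When you override only __eq__, you still get the old __hash__, which no longer agrees with it, and dict gets lost: d1 == d2 looks up each key of d1 in d2 by hash first and never finds it. (Python 3 instead sets __hash__ to None for a class that defines __eq__ without __hash__, so such instances cannot be used as dict keys at all.)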
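To make the mismatch visible, here is a small sketch that prints the pieces separately. It reuses the Foo class from above; the explicit `__hash__ = object.__hash__` line is my addition, only there so the snippet also runs on Python 3 with the identity-based hash described in this paragraph.

```python
class Foo(object):
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return self.value == other.value

    # Assumption: Python 3 would set __hash__ to None here, so the
    # inherited identity-based hash is restored explicitly.
    __hash__ = object.__hash__


f1 = Foo(5)
f2 = Foo(5)

objects_equal = f1 == f2             # __eq__ compares the values
hashes_equal = hash(f1) == hash(f2)  # hashes of two distinct live objects
found = f1 in {f2: 6}                # dict compares hashes before calling __eq__
dicts_equal = {f1: 6} == {f2: 6}     # relies on the same key lookup

print(objects_equal, hashes_equal, found, dicts_equal)
```

On CPython the two objects compare equal, but their hashes differ, so the lookup fails and the dicts compare unequal.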

Simply provide a __hash__ method that combines the hash values of the object's attributes in a stable way:

>>> class Bar(object):
...     def __init__(self, value):
...         self.value = value
...     def __eq__(self, other):
...         return self.value == other.value
...     def __hash__(self):
...         return hash((Bar, self.value))
>>> b1 = Bar(5)
>>> b2 = Bar(5)
>>> {b1: 6} == {b2: 6}
True
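The same fix could be applied to the classes from the question. The following is only a trimmed sketch: it keeps just AllSubSetsOf and its __eq__, and hashing on klass alone is my assumption (it stays consistent with any __eq__ that compares klass). The ClassSetDict here is an empty stand-in for the real subclass.

```python
class ClassSet(object):
    def __init__(self, klass):
        self.klass = klass


class AllSubSetsOf(ClassSet):
    def __eq__(self, other):
        if isinstance(other, AllSubSetsOf):
            return self.klass == other.klass
        return False

    def __ne__(self, other):
        return not self == other

    def __hash__(self):
        # Equal instances share the same klass, so they share this hash.
        return hash(self.klass)


class ClassSetDict(dict):
    # Stand-in for the subclass in the question; the extra method is omitted.
    pass


d1 = ClassSetDict({AllSubSetsOf(object): (int,)})
d2 = ClassSetDict({AllSubSetsOf(object): (int,)})

dicts_equal = d1 == d2
print(dicts_equal)
```

With a hash that agrees with __eq__, the key of d1 is found in d2 and the two dicts compare equal.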

When using __hash__ in this way, it's also a good idea to make sure that the attributes do not (or better, cannot) change after the object is created. If the hash value of a key changes while it is stored in a dict, the key will be "lost", and all sorts of weird things can happen (even weirder than the issue you initially asked about).
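One way to get the "cannot change" part is to reject attribute assignment after construction. This is a sketch under my own naming (FrozenBar is hypothetical, not something from the question), and it only discourages mutation: calling object.__setattr__ directly would still get around it.

```python
class FrozenBar(object):
    def __init__(self, value):
        # Bypass our own __setattr__ once, during construction.
        object.__setattr__(self, "value", value)

    def __setattr__(self, name, new_value):
        raise AttributeError("FrozenBar instances are read-only")

    def __eq__(self, other):
        if not isinstance(other, FrozenBar):
            return NotImplemented
        return self.value == other.value

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        return hash((FrozenBar, self.value))


key = FrozenBar(5)
table = {key: 6}

try:
    key.value = 7      # rejected, so the stored hash stays valid
    mutated = True
except AttributeError:
    mutated = False

still_found = FrozenBar(5) in table
print(mutated, still_found)
```

Because the assignment is refused, the key's hash never drifts away from the one the dict stored, and an equal key keeps finding the entry.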

