
Python __hash__ for equal value objects

Tags: python, hash, set

Say I have some Person entities and I want to know if one is in a list:

person in people?

I don't care about object identity, just that their properties are the same. So I put this in my base class:

# value comparison only
def __eq__(self, other):
    return (isinstance(other, self.__class__) and self.__dict__ == other.__dict__)

def __ne__(self, other):
    return not self.__eq__(other)

But to be able to test for equality in sets, I also need to define __hash__. So...

# sets use __hash__ for equality comparison
def __hash__(self):
    return hash((
        self.PersonID,
        self.FirstName,
        self.LastName,
        # ... etc_etc, one entry per attribute
    ))

The problem is I don't want to list every property, and I don't want to modify the hash function every time the properties change.

So is it okay to do this?

# sets use __hash__ for equality comparison
def __hash__(self):
    values = tuple(self.__dict__.values())
    return hash(values)

Is this sane, and not too much of a performance penalty in a web-app situation?

Thanks muchly.

Barry asked Aug 26 '13 03:08


1 Answer

Two dicts compare equal regardless of the order of their keys, but tuple(self.__dict__.values()) depends on that order. It is therefore prone to producing different results for objects that compare equal (which could happen, for example, if one had its attributes assigned in a different order), and that breaks the rule that equal objects must have equal hashes.
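To make the ordering problem concrete, here is a small sketch. The Person class, its keyword-argument constructor, and the sample values are hypothetical stand-ins for the question's entity, and it assumes Python 3.7+, where dicts keep insertion order:

```python
class Person:
    """Hypothetical stand-in for the question's entity class."""

    def __init__(self, **attrs):
        # Attributes are assigned in whatever order the caller passes them.
        for name, value in attrs.items():
            setattr(self, name, value)

    def __eq__(self, other):
        return isinstance(other, self.__class__) and self.__dict__ == other.__dict__

    def __hash__(self):
        # The order-sensitive hash proposed in the question.
        return hash(tuple(self.__dict__.values()))


a = Person(PersonID=1, FirstName="Ann")
b = Person(FirstName="Ann", PersonID=1)  # same values, different assignment order

values_a = tuple(a.__dict__.values())  # (1, "Ann")
values_b = tuple(b.__dict__.values())  # ("Ann", 1)

# The objects compare equal, yet the tuples being hashed are different.
equal_but_ordered_differently = (a == b) and (values_a != values_b)
```

Since the two tuples differ, the two hashes will almost always differ as well, so a set may fail to find b even though a == b.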

Assuming all of your attribute values are hashable, you could try this instead:

return hash(frozenset(self.__dict__.items()))  # on Python 2, use iteritems()
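As a rough sketch of how that looks in a whole class (again, the Person class, its constructor, and the sample data are made up for illustration, and this is Python 3 syntax):

```python
class Person:
    """Hypothetical stand-in for the question's entity class."""

    def __init__(self, **attrs):
        for name, value in attrs.items():
            setattr(self, name, value)

    def __eq__(self, other):
        return isinstance(other, self.__class__) and self.__dict__ == other.__dict__

    def __hash__(self):
        # A frozenset ignores ordering, and hashing (name, value) pairs keeps
        # the attribute names in the hash. Every value must be hashable.
        return hash(frozenset(self.__dict__.items()))


a = Person(PersonID=1, FirstName="Ann")
b = Person(FirstName="Ann", PersonID=1)  # different assignment order

people = {a}
found = b in people  # True: equal objects now hash equally
```

If any attribute holds an unhashable value such as a list, hash() raises TypeError, so this only works while the attributes stay simple values.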

Alternatively, note that __hash__ does not need to take everything into account because __eq__ will still be used to verify equality when the hash values compare equal. Therefore, you can probably get away with

return hash(self.PersonID)

assuming PersonID is unique, or nearly so, across instances.
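A minimal sketch of the ID-only approach, showing that a hash collision is still resolved by __eq__. The constructor signature and the sample people are assumptions made for the example:

```python
class Person:
    """Hypothetical stand-in for the question's entity class."""

    def __init__(self, PersonID, FirstName, LastName):
        self.PersonID = PersonID
        self.FirstName = FirstName
        self.LastName = LastName

    def __eq__(self, other):
        return isinstance(other, self.__class__) and self.__dict__ == other.__dict__

    def __hash__(self):
        # Only the ID feeds the hash; __eq__ still compares every attribute.
        return hash(self.PersonID)


a = Person(1, "Ann", "Lee")
same = Person(1, "Ann", "Lee")
clash = Person(1, "Bob", "Ray")  # same ID, different values

people = {a}
same_found = same in people    # True: hash matches and __eq__ agrees
clash_found = clash in people  # False: hash matches but __eq__ rejects it
```

Collisions like clash only cost an extra __eq__ call, so the trade-off is a cheap hash in exchange for slower lookups when many instances share an ID.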

nneonneo answered Oct 06 '22 21:10