When implementing a class with multiple properties (like in the toy example below), what is the best way to handle hashing?
I guess that __eq__ and __hash__ should be consistent, but how do I implement a proper hash function that is capable of handling all the properties?
    class AClass:
        def __init__(self):
            self.a = None
            self.b = None

        def __eq__(self, other):
            return other and self.a == other.a and self.b == other.b

        def __ne__(self, other):
            return not self.__eq__(other)

        def __hash__(self):
            return hash((self.a, self.b))
I read on this question that tuples are hashable, so I was wondering if something like the example above was sensible. Is it?
The hash() function accepts an object and returns its hash value as an integer. When you pass an object to hash(), Python calls the object's __hash__ special method. By default, __hash__ is derived from the object's identity, and __eq__ returns True only if both operands are the same object.
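The default behaviour described above can be checked with a small sketch. The class names Plain and EqOnly are hypothetical and exist only for illustration. The second class also shows a related Python 3 rule: a class that defines __eq__ without defining __hash__ becomes unhashable.

```python
class Plain:
    """Defines neither __eq__ nor __hash__, so the defaults from object apply."""
    def __init__(self, a):
        self.a = a


class EqOnly:
    """Defines __eq__ only; Python 3 then sets __hash__ to None."""
    def __init__(self, a):
        self.a = a

    def __eq__(self, other):
        return isinstance(other, EqOnly) and self.a == other.a


p1, p2 = Plain(1), Plain(1)

# hash(obj) delegates to the object's __hash__ method.
calls_dunder = hash(p1) == p1.__hash__()

# Default equality is identity: equal attribute values are not enough.
default_eq_is_identity = (p1 == p1) and (p1 != p2)

# A class with __eq__ but no __hash__ cannot be hashed.
try:
    hash(EqOnly(1))
    eq_only_hashable = True
except TypeError:
    eq_only_hashable = False
```

This is why the class in the question has to define both methods if its instances are meant to be used in sets or as dict keys.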
A hash (a dictionary, in Python terms) can have duplicate values but not duplicate keys. If you want to associate multiple values with a key, you can store a reference to an array (or another hash) at that key and add each value to that array (or hash).
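In Python, the "array at a key" idea could look roughly like the sketch below. The use of collections.defaultdict and the sample data are assumptions chosen for illustration; a plain dict with setdefault would work as well.

```python
from collections import defaultdict

# Each key maps to a list, so one key can carry several values.
tags_by_user = defaultdict(list)
tags_by_user["alice"].append("python")
tags_by_user["alice"].append("hashing")
tags_by_user["bob"].append("python")   # duplicate value under another key is fine

# Assigning to an existing key in a plain dict replaces the old value.
single = {"alice": "python"}
single["alice"] = "hashing"
```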
Hashing algorithms: a hash is supposed to be repeatable, meaning that each time we apply it to the same data we should get the same hash value out. This requires a hashing algorithm or function, for example one based on the MOD (modulo) operation.
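The passage does not show its algorithm, so the following is only an assumed toy example of a repeatable, modulo-based hash. The function name toy_hash and the bucket count of 11 are made up for illustration; this is not how Python's built-in hash() works.

```python
def toy_hash(text: str, buckets: int = 11) -> int:
    """Toy hash: sum the UTF-8 byte values and reduce them modulo the bucket count."""
    return sum(text.encode("utf-8")) % buckets


first = toy_hash("abc")    # (97 + 98 + 99) % 11
second = toy_hash("abc")   # same input, same result
anagram = toy_hash("cab")  # same bytes in another order collide
```

The same input always gives the same bucket, which is the repeatability property. The anagram collision also shows why such a simple scheme distributes values poorly.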
__hash__ should return the same value for objects that are equal. It also shouldn't change over the lifetime of the object; generally you only implement it for immutable objects.
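To illustrate why the hash should not change over an object's lifetime, here is a minimal sketch with a hypothetical mutable Point class. Mutating an attribute that feeds the hash after the object has been stored in a set leaves the set looking in the wrong place.

```python
class Point:
    """Hypothetical mutable class whose hash depends on mutable attributes."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))


p = Point(1, 2)
seen = {p}
hash_before = hash(p)
found_before = p in seen

p.x = 99  # mutate an attribute that the hash depends on

hash_after = hash(p)
found_after_mutation = p in seen  # the set still files p under the old hash
```

The object is still physically inside the set, but a lookup with the new hash no longer reaches it. Keeping the hashed attributes read-only avoids this.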
A trivial implementation would be to just return 0. This is always correct, but performs badly.
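A rough sketch of the constant-hash case, using a hypothetical ConstantHash class with a counter added only to make the cost visible. Every object lands on the same hash, so the set has to fall back on __eq__ to tell them apart.

```python
class ConstantHash:
    """Hypothetical class: correct but slow, because every instance hashes to 0."""
    eq_calls = 0  # counts __eq__ invocations, for illustration only

    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __eq__(self, other):
        if not isinstance(other, ConstantHash):
            return NotImplemented
        ConstantHash.eq_calls += 1
        return (self.a, self.b) == (other.a, other.b)

    def __hash__(self):
        return 0


items = {ConstantHash(i, i) for i in range(100)}
lookup_ok = ConstantHash(5, 5) in items
comparisons = ConstantHash.eq_calls
```

The set still behaves correctly, but each insertion has to be compared against the colliding entries already stored, so the number of __eq__ calls grows much faster than the number of items.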
Your solution, returning the hash of a tuple of properties, is good. But note that you don't need to include in the tuple every property that you compare in __eq__. If some property usually has the same value for unequal objects, just leave it out. Don't make the hash computation any more expensive than it needs to be.
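As a sketch of this advice, the hypothetical Record class below compares three attributes in __eq__ but hashes only two of them. The attribute names and the assumption that region is almost always "EU" are invented for the example.

```python
class Record:
    """Hypothetical class; 'region' is assumed to be nearly always the same value."""
    def __init__(self, key, name, region="EU"):
        self.key = key
        self.name = name
        self.region = region

    def __eq__(self, other):
        if not isinstance(other, Record):
            return NotImplemented
        return (self.key, self.name, self.region) == (
            other.key, other.name, other.region)

    def __hash__(self):
        # region takes part in __eq__ but is left out of the hash.
        return hash((self.key, self.name))


a = Record(1, "alpha")
b = Record(1, "alpha")
c = Record(1, "alpha", region="US")
unique = {a, b, c}
```

Because the hashed attributes are a subset of the compared ones, equal objects still get equal hashes. Objects that differ only in region collide, but __eq__ keeps them apart, so the result is still correct.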
Edit: I would recommend against using xor to mix hashes in general. When two different properties have the same value, they will have the same hash, and with xor these will cancel each other out. Tuples use a more complex calculation to mix hashes; see tuplehash in tupleobject.c.
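The cancellation can be seen in a short sketch. The helper names xor_hash and tuple_hash are hypothetical and only wrap the two mixing strategies being compared.

```python
def xor_hash(a, b):
    """Mix two property hashes with xor (the approach advised against)."""
    return hash(a) ^ hash(b)


def tuple_hash(a, b):
    """Mix two property hashes by hashing a tuple."""
    return hash((a, b))


# Equal property values cancel out under xor, whatever the value is.
xor_same_values = {xor_hash(i, i) for i in range(100)}
tuple_same_values = {tuple_hash(i, i) for i in range(100)}

# xor also ignores which property holds which value.
xor_swapped_equal = xor_hash(1, 2) == xor_hash(2, 1)
```

With xor, every object whose two properties happen to be equal hashes to 0, and swapping the two values never changes the result. Hashing the tuple avoids both patterns.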