Is it a bad idea to implement __hash__ like so?
    class XYZ:
        def __init__(self):
            self.val = None

        def __hash__(self):
            return id(self)
Am I setting up something potentially disastrous?
Unequal objects may have the same hash value, but equal objects need to have the same hash value. When obj1 is obj2 is evaluated, the identities (id values) of the two objects are compared, not their hash values.
The built-in hash() function returns the hash value of an object if it has one. Hash values are integers that are used to compare dictionary keys quickly during a dictionary lookup.
The __hash__ method has to satisfy the following requirement in order to work:
For all x, y such that x == y, hash(x) == hash(y).
In your case your class does not implement __eq__, which means that x == y if and only if id(x) == id(y), and thus your hash implementation satisfies the above property.
Note however that if you do implement __eq__ then this implementation will likely fail.
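As a rough sketch of how this can fail, take a hypothetical Point class (not from the question) that compares by value but keeps the id-based __hash__. Two equal instances that are alive at the same time have different ids, so they hash differently and a set lookup misses:

```python
class Point:
    """Hypothetical class: value-based __eq__, identity-based __hash__ (inconsistent)."""

    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        return isinstance(other, Point) and self.x == other.x

    def __hash__(self):
        return id(self)  # breaks the rule: equal objects must hash equally


p, q = Point(1), Point(1)
seen = {p}
equal = p == q     # True: same value
found = q in seen  # False: q hashes differently, so the lookup misses
```

The class "works" until it is used as a set member or dict key, which is exactly where __hash__ matters.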
Also: there is a difference between having a "valid" __hash__ and having a good hash. For example the following is a valid __hash__ definition for any class:
    def __hash__(self):
        return 1
A good hash should distribute objects as uniformly as possible, so as to avoid collisions. Usually this requires a more complex definition.
I'd avoid trying to come up with formulas and instead rely on Python's built-in hash function.
For example if your class has fields a, b and c then I'd use something like this as __hash__:
    def __hash__(self):
        return hash((self.a, self.b, self.c))
The definition of hash for tuples should be good enough for the average case.
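A minimal sketch of the tuple-based approach, paired with a matching __eq__ so the two stay consistent. The Record class and its _key helper are hypothetical names used only for illustration, and the fields are assumed to be hashable and never modified after construction:

```python
class Record:
    """Hypothetical class with fields a, b, c, treated as immutable by convention."""

    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c

    def _key(self):
        # Single source of truth for both equality and hashing.
        return (self.a, self.b, self.c)

    def __eq__(self, other):
        if not isinstance(other, Record):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        return hash(self._key())


r1 = Record(1, "x", 2.0)
r2 = Record(1, "x", 2.0)
lookup = {r1: "first"}
value = lookup[r2]  # equal objects hash equally, so r2 finds r1's entry
```

Deriving both methods from the same tuple makes it hard for __eq__ and __hash__ to drift apart when a field is added later.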
Finally: you should not define __hash__ in classes that are mutable (in the fields used for equality). That's because modifying an instance changes its hash, and this breaks any dict or set that already holds it.
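To illustrate the kind of breakage, here is a small sketch with a hypothetical mutable Box class whose hash depends on a field that is later changed. The behaviour noted in the comments is what CPython's dict does; treat it as a demonstration rather than a guarantee about every implementation:

```python
class Box:
    """Hypothetical mutable class whose hash depends on a mutable field."""

    def __init__(self, val):
        self.val = val

    def __eq__(self, other):
        return isinstance(other, Box) and self.val == other.val

    def __hash__(self):
        return hash(self.val)


b = Box(1)
d = {b: "stored"}
b.val = 2               # mutate after insertion: hash(b) changes
still_found = b in d    # False: the lookup probes the slot for the new hash
lost_key = Box(1) in d  # False: the old slot is found, but b no longer equals Box(1)
```

The entry is still in the dict (len(d) is 1), but it can no longer be reached through either the old or the new value.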
It's either pointless or wrong, depending on the rest of the class.
If your objects use the default identity-based ==, then defining this __hash__ is pointless. The default __hash__ is also identity-based, but faster, and tweaked to avoid always having the low bits set to 0. Using the default __hash__ would be simpler and more efficient.
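As a quick check of the first case, the class from the question with the custom __hash__ simply removed is already usable as a dict key, because it inherits the identity-based __hash__ and == from object:

```python
class XYZ:
    """The question's class, relying on the inherited __hash__ and __eq__."""

    def __init__(self):
        self.val = None


a, b = XYZ(), XYZ()
registry = {a: "a", b: "b"}                     # works with no __hash__ defined
uses_object_hash = hash(a) == object.__hash__(a)
distinct = a != b                               # default == compares identity
```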
If your objects don't use the default identity-based ==, then your __hash__ is wrong, because it's going to be inconsistent with ==. If your objects are immutable, you should implement __hash__ in a way that would be consistent with ==; if your objects are mutable, you should not implement __hash__ at all (and set __hash__ = None if you need to support Python 2).
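On Python 3 the "do not implement __hash__ at all" option needs no extra work: a class that defines __eq__ without defining __hash__ gets __hash__ set to None implicitly, so its instances are unhashable. A small sketch with a hypothetical Mutable class:

```python
class Mutable:
    """Hypothetical mutable class: defines __eq__ and no __hash__."""

    def __init__(self, val):
        self.val = val

    def __eq__(self, other):
        return isinstance(other, Mutable) and self.val == other.val


hash_attr = Mutable.__hash__  # None: set implicitly because __eq__ is defined

try:
    {Mutable(1)}              # building a set needs hash(), which now fails
    unhashable = False
except TypeError:
    unhashable = True
```

Failing loudly with a TypeError is usually preferable to the silent lookup misses that an inconsistent __hash__ produces.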