I come from Java where even mutable objects can be "hashable".
And I am playing with Python 3.x these days just for fun.
Here is the definition of hashable in Python (from the Python glossary).
hashable
An object is hashable if it has a hash value which never changes during its lifetime (it needs a
__hash__()
method), and can be compared to other objects (it needs an__eq__()
method). Hashable objects which compare equal must have the same hash value.Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally.
All of Python’s immutable built-in objects are hashable; mutable containers (such as lists or dictionaries) are not. Objects which are instances of user-defined classes are hashable by default. They all compare unequal (except with themselves), and their hash value is derived from their
id()
.
I read it and I am thinking... Still... Why didn't they make in Python even mutable objects hashable? E.g. using the same default hashing mechanism as for user-defined objects i.e. as described by the last 2 sentences above.
Objects which are instances of user-defined classes are hashable by default. They all compare unequal (except with themselves), and their hash value is derived from their id().
This feels somewhat weird... so user-defined mutable objects are hashable (via this default hashing mechanism) but built-in mutable objects are not hashable. Doesn't this just complicate things? I don't see what benefits it brings, could someone explain?
While none of the built-in mutable objects are hashable, it is possible to make a mutable object with a hash value that's not mutable. It's common for only a portion of the object to represent its identity, while the rest of the object contains properties that are free to change.
All immutable built-in objects in Python are hashable like tuples while the mutable containers like lists and dictionaries are not hashable. Objects which are instances of the user-defined class are hashable by default, they all compare unequal, and their hash value is their id().
A hashable Python object is any object that has a hash value — an integer identificator of that object which never changes during its lifetime. To check if an object is hashable or not and find out its hash value (if it's hashable), we use the hash() function on this object: print(hash(3.14))Output: 322818021289917443.
This error occurs when trying to hash a list, which is an unhashable object. For example, using a list as a key in a Python dictionary will cause this error since dictionaries only accept hashable data types as a key. The standard way to solve this issue is to cast a list to a tuple, which is a hashable data type.
In Python, mutable objects can be hashable, but it is generally not a good idea, because generally speaking, the equality is defined in terms of these mutable attributes, and this can lead to all sorts of crazy behavhior.
If built-in mutable objects are hashed based on identity, like the default hashing mechanism for user-defined objects, then their hash would be inconsistent with their equality. And that is absolutely a problem. However, user-defined objects by default compare and hash based on identity, so it isn't as bad of a situation, although, this set of affairs isn't very useful.
Note, if you implement __eq__
in a user-defined class, the __hash__
is set to None
, making the class unhashable.
So, from the Python 3 data model documentation:
User-defined classes have
__eq__()
and__hash__()
methods by default; with them, all objects compare unequal (except with themselves) andx.__hash__()
returns an appropriate value such thatx == y
implies both thatx is y
andhash(x) == hash(y)
.A class that overrides
__eq__()
and does not define__hash__()
will have its__hash__()
implicitly set toNone
. When the__hash__()
method of a class isNone
, instances of the class will raise an appropriate TypeError when a program attempts to retrieve their hash value, and will also be correctly identified as unhashable when checkingisinstance(obj, collections.abc.Hashable)
.
Calculating a hash value is like giving an identity to an object which simplify the comparison of objects. The comparison by hash value is generally faster than the comparison by value: for an object, you compare its attributes, for a collection, you compare its items, recursively…
If an object is mutable you need to calculate its hash value again after each changes. If this object was compared equal with another one, after a change it becomes unequal. So, mutable objects must be compared by value, not by hash. It’s a non-send to compare by hash values for mutable objects.
Edit: Java HashCode
Typically, hashCode() just returns the object's address in memory if you don't override it.
See the reference about the hashCode
function.
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the JavaTM programming language.)
So, the Java hashCode
function works the same as the default Python __hash__
function.
In Java, if you use a mutable object in a HashSet
, for instance, the HashSet
isn’t working properly. Because the hashCode
depends of the state of the object it can no longer be retrieved properly, so the check for containment fails.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With