Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it a good idea to hash a Python class?

For example, suppose I do this:

>>> class foo(object):
...     pass
... 
>>> class bar(foo):
...     pass
... 
>>> some_dict = { foo : 'foo',
... bar : 'bar'}
>>> 
>>> some_dict[bar]
'bar'
>>> some_dict[foo]
'foo'
>>> hash(bar)
165007700
>>> id(bar)
165007700

Based on that, it looks like the class is getting hashed as its id number. Therefore, there shouldn't be any danger of worrying about, say, a bar hashing as either a foo or a bar or hash values changing if I mutate the class.

Is this behavior reliable, or are there any gotchas here?

like image 353
Jason Baker Avatar asked Aug 26 '09 15:08

Jason Baker


People also ask

Can you hash a class Python?

Object Hashes. An object hash is an integer number representing the value of the object and can be obtained using the hash() function if the object is hashable. To make a class hashable, it has to implement both the __hash__(self) method and the aforementioned __eq__(self, other) method.

Why hashing is used in Python?

Python hash() function is a built-in function and returns the hash value of an object if it has one. The hash value is an integer which is used to quickly compare dictionary keys while looking at a dictionary.

Can objects be hashed Python?

Python immutable built-in objects are hashable; mutable containers (such as lists or dictionaries) are not. Objects which are instances of user-defined classes are hashable by default. They all compare unequal (except with themselves), and their hash value is derived from their id .

Is Python hash consistent?

Unfortunately, hash is only consistent within a single Python instance, rendering it useless for hosts to compare their respective hashes.


2 Answers

Yes, any object that doesn't implement a __hash__() function will return its id when hashed. From Python Language Reference: Data Model - Basic Customization:

User-defined classes have __cmp__() and __hash__() methods by default; with them, all objects compare unequal (except with themselves) and x.__hash__() returns id(x).

However, if you're looking to have a unique identifier, use id to be clear about your intent. A hash of an object should be a combination of the hashes of its components. See the above link for more details.

like image 153
Andrew Keeton Avatar answered Sep 28 '22 03:09

Andrew Keeton


Classes have default implementations of __eq__ and __hash__ that use id() to make comparisons and compute hash values, respectively. That is, they compare by identity. The primary rule for implementing __hash__ methods is that if two objects compare equal to each other, they must also have the same hash value. Hash values can be seen as just an optimization used by dicts and sets to do find equal objects faster. Consequently, if you change __eq__ to do a different kind of equality testing, you must also change your __hash__ implementation to agree with that choice.

Classes that use identity for comparisons can be freely mutated and used in dicts and sets because their identity never changes. Classes that implement __eq__ to compare by value and allow mutation of their values cannot be used in hash collections.

like image 29
Robert Kern Avatar answered Sep 28 '22 03:09

Robert Kern