Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why make lists unhashable?

Tags:

python

list

hash

A common issue on SO is removing duplicates from a list of lists. Since lists are unhashable, set([[1, 2], [3, 4], [1, 2]]) throws TypeError: unhashable type: 'list'. Answers to this kind of question usually involve using tuples, which are immutable and therefore hashable.

This answer to What makes lists unhashable? include the following:

If the hash value changes after it gets stored at a particular slot in the dictionary, it will lead to an inconsistent dictionary. For example, initially the list would have gotten stored at location A, which was determined based on the hash value. If the hash value changes, and if we look for the list we might not find it at location A, or as per the new hash value, we might find some other object.

but I don't quite understand because other types that can be used for dictionary keys can be changed without issue:

>>> d = {}
>>> a = 1234
>>> d[a] = 'foo'
>>> a += 1
>>> d[a] = 'bar'
>>> d
{1234: 'foo', 1235: 'bar'}

It is obvious that if the value of a changes, it will hash to a different location in the dictionary. Why is the same assumption dangerous for a list? Why is the following an unsafe method for hashing a list, since it is what we all use when we need to anyway?

>>> class my_list(list):
...   def __hash__(self):
...     return tuple(self).__hash__()
...
>>> a = my_list([1, 2])
>>> b = my_list([3, 4])
>>> c = my_list([1, 2])
>>> foo = [a, b, c]
>>> foo
[[1, 2], [3, 4], [1, 2]]
>>> set(foo)
set([[1, 2], [3, 4]])

It seems that this solves the set() problem, why is this an issue? Lists may be mutable, but they are ordered which seems like it would be all that's needed for hashing.

like image 989
Will Avatar asked Aug 23 '16 15:08

Will


People also ask

Why list is Unhashable?

The Python TypeError: unhashable type: 'list' usually means that a list is being used as a hash argument. This error occurs when trying to hash a list, which is an unhashable object.

Why list is Unhashable and tuple is hashable?

Because lists are mutable and tuples aren't.

What does Unhashable mean in Python?

TypeError: unhashable type: 'list' usually means that you are trying to use a list as an hash argument. This means that when you try to hash an unhashable object it will result an error. For ex. when you use a list as a key in the dictionary , this cannot be done because lists can't be hashed.

How do I fix TypeError Unhashable type list in Python?

The “TypeError: unhashable type: 'list'” error is raised when you try to assign a list as a key in a dictionary. To solve this error, ensure you only assign a hashable object, such as a string or a tuple, as a key for a dictionary.


1 Answers

You seem to confuse mutability with rebinding. a += 1 assigns a new object, the int object with the numeric value 1235, to a. Under the hood, for immutable objects like int, a += 1 is just the same as a = a + 1.

The original 1234 object is not mutated. The dictionary is still using an int object with numeric value 1234 as the key. The dictionary still holds a reference to that object, even though a now references a different object. The two references are independent.

Try this instead:

>>> class BadKey:
...     def __init__(self, value):
...         self.value = value
...     def __eq__(self, other):
...         return other == self.value
...     def __hash__(self):
...         return hash(self.value)
...     def __repr__(self):
...         return 'BadKey({!r})'.format(self.value)
...
>>> badkey = BadKey('foo')
>>> d = {badkey: 42}
>>> badkey.value = 'bar'
>>> print(d)
{BadKey('bar'): 42}

Note that I altered the attribute value on the badkey instance. I didn't even touch the dictionary. The dictionary reflects the change; the actual key value itself was mutated, the object that both the name badkey and the dictionary reference.

However, you now can't access that key anymore:

>>> badkey in d
False
>>> BadKey('bar') in d
False
>>> for key in d:
...     print(key, key in d)
...
BadKey('bar') False

I have thoroughly broken my dictionary, because I can no longer reliably locate the key.

That's because BadKey violates the principles of hashability; that the hash value must remain stable. You can only do that if you don't change anything about the object that the hash is based on. And the hash must be based on whatever makes two instances equal.

For lists, the contents make two list objects equal. And you can change those, so you can't produce a stable hash either.

like image 107
Martijn Pieters Avatar answered Oct 25 '22 07:10

Martijn Pieters