
Why do python sets hold False and Zero exclusively?

Tags: python, set

when creating a set:

>>> falsey_set = {0, '', False, None}  # set([False, '', None])
>>> falsey_set = {False, '', 0, None}  # set([0,'', None])
>>> # adding an item to the set doesn't change anything either
>>> falsey_set.add(False)  # set([0,'',None])

or a dictionary, which mimics the behavior somewhat:

>>> falsey_dict = {0:"zero", False:"false"}  # {0:'false'}  # that's not a typo
>>> falsey_dict = {False:'false', 0:'zero'}  # {False: 'zero'} # again, not a typo
>>> falsey_set.add(())  # set([0,'', None, ()])
>>> falsey_set.add({})  
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'
>>> falsey_dict[()] = 'list'  # {False:'zero', ():'list'}
>>> falsey_dict[{}] = 'dict'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'

0 and False always remove one another from the set. In the case of dictionaries the results are incorrect altogether. Is there a reason for this? I realize that booleans are derived from integers in Python, but what's the Pythonic reasoning for acting this way in the context of sets specifically (I don't care about dictionaries too much)? The equivalence is useful in truthy comparisons like:

>>> False == 0  # True

But there is obvious value in differentiation:

>>> False is 0  # False

I've been looking over the documentation and can't seem to find a reference for this behavior.

Update

@delnan I think you hit the nail on the head with the hash determinism you mentioned in the comments. As @mgilson notes, both False and 0 use the same hashing function; however, so do object and many of its subclasses (e.g. super), which have identical hash functions. The key seems to be the phrase "Hashable objects which compare equal must have the same hash value" from the documentation. Since False == 0 and both are hashable, their hash values must, by Python's definition, be equal. Finally, the definition of hashable states how sets use hashability for set membership: "Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally." While I still don't understand why they both use the same hashing function, I can settle for going this deep.
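
The contract quoted above can be checked directly. This is only a small sketch, assuming Python 3 (so sets print as {0} rather than set([0])); the names s and numbers are illustrative and not from the original question.

```python
# Equal hashable objects must share a hash value, so a set
# treats False and 0 as one and the same member.
assert False == 0
assert hash(False) == hash(0)

s = set()
s.add(0)
s.add(False)                 # no effect: same hash, and False == 0
print(len(s))                # 1
print(False in s, 0 in s)    # True True

# The same rule applies to any equal numbers, not only bool and int.
numbers = set()
for value in (1, 1.0, True):
    numbers.add(value)
print(len(numbers))          # 1
```

Membership is decided by hash plus equality, never by identity or type, which is why `False is 0` being False has no bearing on what the set stores.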

If we all agree then someone propose a polished answer, and I'll accept it. If there could be some improvement or if I'm off base then please let it be known below.

Marc asked Oct 09 '14 22:10


1 Answer

It's because False and 0 hash to the same value and are equal.

They hash to the same value because bool is a subclass of int, so bool.__hash__ simply resolves to the same underlying implementation as int.__hash__:

>>> bool.__hash__ is int.__hash__
True
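
As a rough illustration of the point above, and of one possible way to keep False and 0 apart when that matters, here is a short sketch assuming Python 3. The helper typed_key is hypothetical and not part of the standard library; it is just one way to make the type part of the key.

```python
# bool is an int subclass, so it inherits int's hashing and equality.
assert issubclass(bool, int)
assert bool.__hash__ is int.__hash__
assert hash(False) == hash(0) and False == 0
assert hash(True) == hash(1) and True == 1


def typed_key(value):
    """Hypothetical helper: pair a value with its exact type."""
    return (type(value), value)


# (bool, False) != (int, 0), so both survive as separate members.
distinct = {typed_key(v) for v in (0, False, "", None)}
print(len(distinct))  # 4
```

Because the tuples differ in their first element, equality fails even where the hashes collide, so the set keeps all four entries.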
mgilson answered Oct 02 '22 12:10