Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does hashing work for python sets [duplicate]

I am completely familiar on how hashtables and hashes work but I am trying to fully understand how the O(1) completely comes from.

set1 = {'s','t'}
print('x' in set1)
print('s' in set1)
set2 = {'s'}
print('s' in set2)

I am told that to check if 's' is in set1, if will check the memory allocation of the hash of 's', and check if it is in set1 in O(1) and return the boolean. Therefore two O(1) operations, but my question is: How does the hashes actually work indepth. What I mean by this is, when you hash 's', does that hash have something like set1 and set2 and you're checking if set1 is either set1 or set2, or does each set have a different hash of 's' and you're checking the hash of 's' for each different set.

like image 353
memelord23 Avatar asked Jun 30 '16 12:06

memelord23


2 Answers

An (immutable) object will always have the same hash for its entire lifetime. The hash will be the same between set1 and set2 -- This allows for operations like union and intersection to take place between sets, as well as determining if a member is unique within the set, among other things.

>>> first_dict = {"s":"s".__hash__(), "t":"t".__hash__()}
>>> second_dict = {"s":"s".__hash__()}
>>> first_dict
{'s': -638743784, 't': 711199131}
>>> second_dict
{'s': -638743784}

You can have identical strings that are different objects in memory. But since they're identical strings, they will hash the same.

>>> id(a)
7858120
>>> id(b)
7858624
>>> a.__hash__()
-1117164856
>>> b.__hash__()
-1117164856

An object is hashable if it has a hash value which never changes during its lifetime (it needs a hash() method), and can be compared to other objects (it needs an eq() method). Hashable objects which compare equal must have the same hash value.

Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally.

All of Python’s immutable built-in objects are hashable, while no mutable containers (such as lists or dictionaries) are. Objects which are instances of user-defined classes are hashable by default; they all compare unequal (except with themselves), and their hash value is derived from their id().

like image 99
sytech Avatar answered Oct 11 '22 14:10

sytech


I suggest you read this previous question, It will explain in details the hashing in Python.

like image 35
Gal Dreiman Avatar answered Oct 11 '22 14:10

Gal Dreiman