Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Hashing a tuple in Python where order matters?

Tags:

python

I have:

tuple1 = token1, token2
tuple2 = token2, token1
for tuple in [tuple1, tuple2]:
    if tuple in dict:
        dict[tuple] += 1
    else:
        dict[tuple] = 1

However, both tuple 1 and tuple2 are getting the same counts. What is a way to hash a group of 2 things such that order matters?

like image 359
Shazam Avatar asked Jan 17 '13 21:01

Shazam


People also ask

Does order matter in tuples Python?

The order of the elements in a list is an intrinsic property of that list and does not change, unless the list itself is modified. (The same is true of tuples, except of course they can't be modified.) The next tutorial will introduce you to the Python dictionary: a composite data type that is unordered. Read on!

Can tuple be hashed in Python?

Practical Data Science using PythonWe have to find the hash value of this tuple by using hash() function. This is a built-in function. The hash() function can work on some datatypes like int, float, string, tuples etc, but some types like lists are not hashable. As lists are mutable in nature, we cannot hash it.

Why tuple is ordered in Python?

In Lists and Tuples, the items are retained in the order in which they are inserted. The elements can always be accessed based on their position. The element at the position or 'index' 0 will always be at index 0. Therefore, the Lists and Tuples are said to be ordered collections.

What is hash function in tuple?

The hash() function accepts a single object as a parameter and returns a hash which is an integer value. hash() function only works on immutable objects such as int, string, tuple, float, long, Unicode, bool. Mutable objects such as list, dict, set, bytearray are non-hashable. Example. Copy Code.


1 Answers

Order is taken into account when hashing:

>>> hash((1,2))
1299869600
>>> hash((2,1))
1499606158

This assumes that the objects themselves have unique hashes. Even if they don't, you could still be OK when using it in a dictionary (as long as the objects themselves aren't equal as defined by their __eq__ method):

>>> t1 = 'a',hash('a') 
>>> [hash(x) for x in t1]  #both elements in the tuple have same hash value since `int` hash to themselves in cpython
[-468864544, -468864544]
>>> t2 = hash('a'),'a'
>>> hash(t1)
1486610051
>>> hash(t2)
1486610051
>>> d = {t1:1,t2:2}  #This is OK.  dict's don't fail when there is a hash collision
>>> d
{('a', -468864544): 1, (-468864544, 'a'): 2}
>>> d[t1]+=7
>>> d[t1]
8
>>> d[t1]+=7
>>> d[t1]
15
>>> d[t2]   #didn't touch d[t2] as expected.
2

Note that due to hash collisions, this dict is likely to be less efficient than another dict where there aren't hash collisions :)

like image 79
mgilson Avatar answered Oct 03 '22 03:10

mgilson