Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Hash function for collection of items that disregards ordering

I am using a the hash() function to get the hash value of my object which contains two integers and two Strings. Moreover, I have a dictionary where I store these objects; the process is that I check if the object exists with the hash value, if yes I update if not I insert the new one.

The thing is that when creating the objects, I do not know the order of the object variables and I want to treat the objects as same no matter the order of these variables.

Is there an alternative function to the hash() function that does not consider the order of the variables?

#Consequently what I want is:
hash((int1,str1,int2,str2)) == hash((int2,str2,int1,str1)) 
like image 590
Grzegorz Avatar asked Dec 06 '22 15:12

Grzegorz


1 Answers

You could use a frozenset instead of a tuple:

>>> hash(frozenset([1, 2, 'a', 'b']))
1190978740469805404
>>>
>>> hash(frozenset([1, 'a', 2, 'b']))
1190978740469805404
>>>
>>> hash(frozenset(['a', 2, 'b', 1]))
1190978740469805404

However, the removal of duplicates from the iterable presents a subtle problem:

>>> hash(frozenset([1,2,1])) == hash(frozenset([1,2,2]))
True

You can fix this by creating a counter from the iterable using collections.Counter, and calling frozenset on the counter's items, thus preserving the count of each item from the original iterable:

>>> from collections import Counter
>>>
>>> hash(frozenset(Counter([1,2,1]).items())) 
-307001354391131208
>>> hash(frozenset(Counter([1,1,2]).items()))
-307001354391131208
>>> 
>>> hash(frozenset(Counter([1,2,1]).items())) == hash(frozenset(Counter([1,2,2]).items()))
False
like image 146
Moses Koledoye Avatar answered Feb 25 '23 20:02

Moses Koledoye