Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Inconsistent hash output on tuples

Comparing hashes for some two-element tuples

for i in range(11):
    print(i, hash((i,i)) == hash((-i,-i)))

I expected to get True for when i==0 and False for the rest. I was surprised to see this:

0 True
1 False
2 True
3 True
4 True
5 True
6 True
7 True
8 False
9 True
10 True

Why is this happening?

AFAIK it is not the same issue as in this question, since it's not about order but the values themselves.

like image 683
alex Avatar asked Oct 24 '25 21:10

alex


1 Answers

Hash values are never guaranteed to be collision-free even though good hashing algorithms strive to make them so. Python 3.8 has improved on the hashing algorithm particularly for tuples so the issue in the question has now become much harder to reproduce in Python 3.8 compared to Python 3.7.

Excerpt from the changelog for Python 3.8.0 alpha 1:

bpo-34751: The hash function for tuples is now based on xxHash which gives better collision results on (formerly) pathological cases. Additionally, on 64-bit systems it improves tuple hashes in general. Patch by Jeroen Demeyer with substantial contributions by Tim Peters.

like image 83
blhsing Avatar answered Oct 26 '25 11:10

blhsing