
Hashing tuple in Python causing different results in different systems

Tags: python, hash

I was practicing tuple hashing in an environment running Python 2.7. Below is the code:

num = int(raw_input())
num_list = [int(x) for x in raw_input().split()]
print(hash(tuple(num_list)))

The above code results in

>>> 2
>>> 1 2
>>> 3713081631934410656

But on my local PC, where I am using Python 3.4, the result is

>>> 1299869600

The code is accepted, but I could not find out what causes the different results. Is it because of the different Python versions?

Solaman Raji asked Dec 03 '15 06:12



1 Answer

hash() may return different values for the same object on different operating systems, architectures, Python implementations, and Python versions.

It is designed to be used only within a single Python session, not across sessions or machines, so you should never rely on the value of hash() outside the process that computed it.
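To see which of these factors differ between two machines, you can print a few interpreter details next to the hash itself. This is only a rough sketch, assuming Python 3.3 or later (where sys.implementation and sys.hash_info are available); the helper name describe_hash_environment is made up for illustration. The hash width in particular may differ between 32-bit and 64-bit builds, which could plausibly explain two different values for the same tuple.

```python
import sys

def describe_hash_environment():
    """Collect interpreter details that can influence hash() results.

    Hypothetical helper for illustration; not part of any library.
    """
    return {
        "implementation": sys.implementation.name,  # e.g. "cpython" or "pypy"
        "version": sys.version_info[:3],            # (major, minor, micro)
        "hash_width_bits": sys.hash_info.width,     # bit width of hash values
        "is_64bit_build": sys.maxsize > 2**32,      # True on 64-bit builds
    }

info = describe_hash_environment()
print(info)
# The value printed below is only meaningful on this interpreter build.
print(hash((1, 2)))
```

Running this on both systems and comparing the output should show which part of the environment differs.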

If you need hashing that yields the same results everywhere, consider checksums such as:

  • MD5 or SHA1,
  • xxHash, which per its author provides stable results across multiple OSes and architectures (little or big endian, 32/64 bits, POSIX or not, etc.),
  • or, with some caution, Murmur, as some versions may yield different results on different architectures. For instance, I experienced this first hand when porting a C Murmur2 to an IBM S390 Linux install (of all odd places!). To avoid issues, I ended up coding a slow but arch-independent pure Python implementation on that OS rather than using a C implementation.
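As one possible illustration of the first option, the sketch below hashes a tuple of integers with SHA1 from the standard hashlib module. The function name stable_tuple_hash and the comma-separated serialization are assumptions made for this example, not something prescribed by hashlib; any unambiguous byte encoding of the tuple would do. It is written for Python 3.

```python
import hashlib

def stable_tuple_hash(values):
    """Return a SHA1 hex digest for a tuple of ints.

    Hypothetical helper: the digest depends only on the bytes fed to
    SHA1, not on the OS, architecture, or Python version.
    """
    # Serialize to an unambiguous text form; the comma separator keeps
    # (1, 23) distinct from (12, 3).
    payload = ",".join(str(int(v)) for v in values).encode("ascii")
    return hashlib.sha1(payload).hexdigest()

digest = stable_tuple_hash((1, 2))
print(digest)
```

Unlike hash(), the result is a 40-character hex string rather than an int. If an integer is needed, int(digest, 16) converts it.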
Philippe Ombredanne answered Oct 19 '22 14:10