
Why is hash() slower under Python 3.4 than under Python 2.7?

I was doing some performance evaluation using timeit and discovered a performance degradation between Python 2.7.10 and Python 3.4.3. I narrowed it down to the hash() function:

python 2.7.10:

>>> import timeit
>>> timeit.timeit('for x in xrange(100): hash(x)', number=100000)
0.4529099464416504
>>> timeit.timeit('hash(1000)')
0.044638872146606445

python 3.4.3:

>>> import timeit
>>> timeit.timeit('for x in range(100): hash(x)', number=100000)
0.6459149940637872
>>> timeit.timeit('hash(1000)')
0.07708719989750534

That's a slowdown of roughly 40% on the loop benchmark! It doesn't seem to matter whether integers, floats, or strings (unicode or byte strings) are being hashed; the slowdown is about the same. In both cases hash() returns a 64-bit integer. The above was run on my Mac; on an Ubuntu box the slowdown was smaller (about 20%).
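For anyone who wants to repeat the per-type comparison, here is a minimal sketch that should run unchanged on both interpreters. The helper name `time_hash` and the sample values are made up for illustration, and it assumes each sample's repr() evaluates back to an equal value. Keep in mind that CPython caches a str object's hash on the object, so hashing the same string repeatedly mostly measures call overhead rather than the string hash algorithm itself.

```python
import timeit


def time_hash(samples, number=100000):
    """Time hash(v) for each named sample value.

    `samples` maps a label to a value whose repr() round-trips
    (an assumption of this sketch), so it can be rebuilt in the
    timeit setup string on both Python 2.7 and Python 3.x.
    """
    results = {}
    for label, value in samples.items():
        setup = "v = %r" % (value,)
        results[label] = timeit.timeit("hash(v)", setup=setup, number=number)
    return results


if __name__ == "__main__":
    # Illustrative sample values, not the ones from the original test.
    timings = time_hash({"int": 1000, "float": 3.14, "str": "some text"})
    for label in sorted(timings):
        print("%-6s %.4f s" % (label, timings[label]))
```

Running the same file under each interpreter and comparing the printed timings gives a per-type view of the difference.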

I've also set PYTHONHASHSEED=random for the Python 2.7 tests. In some cases, restarting Python for each case, hash() got a bit slower, but never as slow as under Python 3.4.
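A rough sketch of how the "restart Python for each case" step could be scripted, assuming a Python 3 interpreter drives the child processes. The helper name `hash_in_child` is hypothetical; it starts a fresh interpreter with a given PYTHONHASHSEED and returns the value of a hash expression.

```python
import os
import subprocess
import sys


def hash_in_child(seed, expr="hash('abc')"):
    """Evaluate `expr` in a fresh interpreter with PYTHONHASHSEED=seed.

    `seed` may be an integer or the string "random".
    """
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "print(%s)" % expr], env=env
    )
    return int(out.decode().strip())


if __name__ == "__main__":
    # A fixed seed gives the same str hash in every new process.
    print(hash_in_child(0), hash_in_child(0))
    # With "random", the str hash will almost certainly differ per process.
    print(hash_in_child("random"), hash_in_child("random"))
    # Small integers hash to themselves whatever the seed is.
    print(hash_in_child("random", "hash(1000)"))
```

Replacing `sys.executable` with the path to another interpreter (an assumption about the local setup) would let the same script drive Python 2.7.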

Does anyone know what's going on here? Was a more secure but slower hash function chosen for Python 3?

Chris Cogdon asked Oct 19 '16 16:10




1 Answer

Two things changed in the hash() function between Python 2.7 and Python 3.4:

  1. Adoption of SipHash
  2. Hash randomization enabled by default

References:

  • Since Python 3.4, SipHash is the default hash algorithm for str and bytes objects (PEP 456). Read: Python adopts SipHash
  • Since Python 3.3, hash randomization is enabled by default. Reference: object.__hash__ (last line of that section). Setting PYTHONHASHSEED to 0 disables hash randomization.
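To check which of these applies to a given interpreter, the settings can be read from the sys module. This is a small sketch, with `describe_hashing` as an illustrative helper name; `sys.hash_info.algorithm` only exists from Python 3.4 on, so the sketch falls back to None on older versions.

```python
import sys


def describe_hashing():
    """Summarize the running interpreter's hash configuration."""
    info = sys.hash_info
    return {
        # Name of the str/bytes hash algorithm (Python 3.4+), else None.
        "algorithm": getattr(info, "algorithm", None),
        # Width in bits of hash values (32 or 64).
        "width": info.width,
        # True when hash randomization is active in this process.
        "randomized": bool(sys.flags.hash_randomization),
    }


if __name__ == "__main__":
    for key, value in sorted(describe_hashing().items()):
        print("%-10s %s" % (key, value))
```

Starting the interpreter with PYTHONHASHSEED=0 should make `randomized` come out as False.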
Moinuddin Quadri answered Oct 18 '22 18:10
