Is Python's hash function portable?
By "portable" I mean: will it return the same results (for the same data) across Python versions, platforms and implementations?
If not, is there any alternative that provides this (while still being capable of hashing common data structures)?
The documentation is not particularly helpful. This question refers to a library that seems to roll its own version, but I'm not sure non-portability is the reason for it.
No, hash() is not guaranteed to be portable.
Python 3.3 also uses hash randomisation by default, where certain types are hashed with a hash seed picked at start-up. Hash values then differ between Python interpreter invocations.
From the object.__hash__() documentation:
By default, the __hash__() values of str, bytes and datetime objects are “salted” with an unpredictable random value. Although they remain constant within an individual Python process, they are not predictable between repeated invocations of Python.
This is intended to provide protection against a denial-of-service caused by carefully-chosen inputs that exploit the worst case performance of a dict insertion, O(n^2) complexity. See http://www.ocert.org/advisories/ocert-2011-003.html for details.
Changing hash values affects the iteration order of dicts, sets and other mappings. Python has never made guarantees about this ordering (and it typically varies between 32-bit and 64-bit builds).
See also PYTHONHASHSEED.
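The effect of the hash seed can be observed by asking fresh interpreter processes for the same string hash. This is only a small sketch: the helper name str_hash_in_fresh_interpreter and the sample string are made up for illustration, and it assumes the current interpreter can be re-launched through sys.executable.

```python
import os
import subprocess
import sys


def str_hash_in_fresh_interpreter(seed):
    """Return hash('portable?') as computed by a new interpreter process.

    `seed` is passed through the PYTHONHASHSEED environment variable.
    (Hypothetical helper, named here only for illustration.)
    """
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "print(hash('portable?'))"],
        env=env,
    )
    return int(out)


if __name__ == "__main__":
    # Same seed: the value is reproducible between invocations.
    print(str_hash_in_fresh_interpreter(1), str_hash_in_fresh_interpreter(1))
    # Different seed: the value is expected to change.
    print(str_hash_in_fresh_interpreter(2))
```

Fixing PYTHONHASHSEED makes str hashes repeatable for one particular interpreter build, but that still says nothing about other Python versions, platforms or implementations.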
Python 2.6.8 and 3.2.3 and newer support the same feature but leave it disabled by default; it can be switched on with the -R command-line option or by setting PYTHONHASHSEED=random.
Python 3.2 introduced a sys.hash_info named tuple that gives you details about the hash implementation for the current interpreter.
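As a quick illustration, the tuple can be inspected like this. The algorithm field only exists on Python 3.4 and newer, so the sketch reads it defensively.

```python
import sys

info = sys.hash_info

# Width in bits of hash values on this build (typically 32 or 64).
print("width:", info.width)
# Prime modulus used by the numeric hashing scheme.
print("modulus:", info.modulus)
# Name of the str/bytes hash algorithm; only present on Python 3.4+.
print("algorithm:", getattr(info, "algorithm", "not reported"))
```

The values differ between builds (for example 32-bit versus 64-bit), which is one more reason not to rely on hash() output outside the current process.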
If you need a portable hash, there are plenty of implementations. The standard library includes a cryptographic hash library called hashlib; these implementations are definitely portable. Another option would be the mmh3 package, which provides Murmur3 non-cryptographic hash function implementations.
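A minimal sketch of the hashlib route: the digest of a given byte string is defined by the algorithm, not by the interpreter. The helper stable_hash64 is a made-up name, and truncating the digest to 64 bits is just one possible way to get an integer comparable in size to what hash() returns.

```python
import hashlib


def stable_hash64(data):
    """Return a 64-bit integer derived from the SHA-256 digest of `data` (bytes).

    Hypothetical helper: it keeps the first 8 bytes of the digest,
    interpreted as a big-endian unsigned integer.
    """
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:8], "big")


if __name__ == "__main__":
    # SHA-256 test vector for b"abc":
    # ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
    print(hashlib.sha256(b"abc").hexdigest())
    print(stable_hash64(b"abc"))
```

Strings have to be encoded first (for example with .encode("utf-8")), and the encoding must be fixed for the result to stay portable.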
Common data structures would need to be converted to bytes first; you could use serialisation for that, like the json or pickle modules.
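One way this could look for JSON-compatible data is sketched below. The function name portable_hash is hypothetical, and the choices of sort_keys=True, compact separators and SHA-256 are assumptions made to get a canonical byte string, not something the approach requires.

```python
import hashlib
import json


def portable_hash(obj):
    """Hash a JSON-serialisable object in a way that ignores dict key order.

    Hypothetical helper: serialises to canonical JSON text, encodes it as
    UTF-8 and returns the SHA-256 hex digest.
    """
    payload = json.dumps(
        obj,
        sort_keys=True,          # same bytes regardless of dict insertion order
        separators=(",", ":"),   # no optional whitespace
        ensure_ascii=True,       # output is pure ASCII
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    print(portable_hash({"name": "spam", "tags": ["a", "b"]}))
    print(portable_hash({"tags": ["a", "b"], "name": "spam"}))  # same digest
```

This only covers what json can represent: tuples are serialised as lists, non-string dict keys are converted to strings, and sets or custom objects need their own conversion first. With pickle, the output can depend on the protocol version, so the protocol would have to be pinned as well.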