I'm thinking of creating a checksum of a dict to know whether it has been modified. For the moment I have this:
>>> import hashlib
>>> import pickle
>>> d = {'k': 'v', 'k2': 'v2'}
>>> z = pickle.dumps(d)
>>> hashlib.md5(z).hexdigest()
'8521955ed8c63c554744058c9888dc30'
Perhaps a better solution exists?
Note: I want to create a unique ID for a dict so that I can build a good ETag.
EDIT: I can have abstract data in the dict.
The main use case of Python's built-in hash() function is to compare dictionary keys during a lookup. Anything that is hashable can be used as a key in a dictionary, for example {(1, 2): "hi there"}. This sets us up for a simple MD5-based hashing of dictionaries.
Use the hashlib.md5() function to generate and check the MD5 checksum of a file in Python. The hashlib module implements a common interface to several different secure hash and message digest algorithms.
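As a rough illustration of that common interface, here is a minimal sketch that hashes a binary stream in chunks. The helper name stream_digest and the chunk size are assumptions made for this example, and io.BytesIO stands in for a file opened with open(path, "rb").

```python
import hashlib
import io


def stream_digest(stream, algorithm="md5", chunk_size=65536):
    """Return the hex digest of a binary stream, reading it in chunks."""
    h = hashlib.new(algorithm)  # same interface for md5, sha1, sha256, ...
    # iter() with a sentinel keeps calling read() until it returns b"".
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


# io.BytesIO is a stand-in for a real file object opened in binary mode.
print(stream_digest(io.BytesIO(b"hello")))
print(stream_digest(io.BytesIO(b"hello"), "sha256"))
```

Reading in chunks keeps memory use flat for large files, and switching algorithms only means passing a different name.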
So, there you have it: Python uses SipHash for hashing str and bytes objects because it is a trusted, keyed cryptographic hash function designed to resist collision (hash-flooding) attacks.
Python's dict uses open addressing to resolve hash collisions (see dictobject.c:296-297). In open addressing, hash collisions are resolved by probing: when the slot for a key is already taken, the table tries other slots in a fixed sequence until it finds the key or a free slot. The hash table is just a contiguous block of memory (like an array, so you can do O(1) lookup by index).
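To make the idea of probing concrete, here is a toy sketch of an open-addressing table. It uses plain linear probing, which is a simplification and not the probe sequence CPython actually uses. The class name ToyTable and its methods are invented for this example, and it supports neither deletion nor resizing.

```python
class ToyTable:
    """Toy open-addressing hash table with linear probing (illustration only)."""

    def __init__(self, size=8):
        # One contiguous array; each slot is None or a (key, value) pair.
        self.slots = [None] * size

    def _probe(self, key):
        # Start at hash(key) % size and walk forward until we find
        # either the key itself or an empty slot.
        size = len(self.slots)
        start = hash(key) % size
        for step in range(size):
            index = (start + step) % size
            entry = self.slots[index]
            if entry is None or entry[0] == key:
                return index
        raise RuntimeError("table is full")

    def put(self, key, value):
        self.slots[self._probe(key)] = (key, value)

    def get(self, key):
        entry = self.slots[self._probe(key)]
        if entry is None:
            raise KeyError(key)
        return entry[1]


table = ToyTable()
table.put(0, "a")  # hash(0) % 8 == 0
table.put(8, "b")  # hash(8) % 8 == 0 as well, so this collides and probes on
print(table.get(0), table.get(8))
```

The two keys map to the same starting slot, so the second one ends up in the next free slot, and a lookup follows the same probe path to find it.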
Something like this:
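from functools import reduce  # reduce is no longer a builtin in Python 3
reduce(lambda x, y: x ^ y, [hash(item) for item in d.items()], 0)
Take the hash of each (key, value) tuple in the dict and XOR them all together. The initial value 0 keeps this from failing on an empty dict.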
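Here is a minimal sketch of that XOR approach wrapped in a function; the name xor_items_hash is made up for this example. Two caveats: every value must be hashable, and hash() of str and bytes is salted per process unless PYTHONHASHSEED is set, so the result is only comparable within a single interpreter run. That makes it fine for detecting changes in memory, but probably not for an ETag served by several processes.

```python
from functools import reduce
from operator import xor


def xor_items_hash(d):
    """XOR the hashes of all (key, value) pairs of a dict."""
    # XOR is commutative, so the result does not depend on insertion order.
    # The initial value 0 makes an empty dict hash to 0 instead of raising.
    return reduce(xor, (hash(item) for item in d.items()), 0)


a = {"k": "v", "k2": "v2"}
b = {"k2": "v2", "k": "v"}  # same items, different insertion order
print(xor_items_hash(a) == xor_items_hash(b))  # True
```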
@katrielalex If the dict contains unhashable items you could do this:
hash(str(d))
or maybe even better
hash(repr(d))
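Note that str(d) and repr(d) follow insertion order, and the built-in hash() of a string changes between interpreter runs, so for an ETag you may want something more stable. A rough sketch, assuming the keys are strings and the values are either JSON-serializable or have a stable repr (the function name dict_etag is hypothetical):

```python
import hashlib
import json


def dict_etag(d):
    """Return an MD5 hex digest of a canonical JSON dump of the dict."""
    # sort_keys=True makes the output independent of insertion order.
    # default=repr is a fallback for values json cannot serialize; this
    # assumes their repr is stable (no memory addresses in it).
    payload = json.dumps(d, sort_keys=True, default=repr, separators=(",", ":"))
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


print(dict_etag({"k": "v", "k2": "v2"}))
print(dict_etag({"k2": "v2", "k": "v"}))  # same digest as the line above
```

MD5 is used here only to match the question; any other hashlib algorithm would work the same way.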