
Are SHA1 hashes distributed uniformly?

I have a string in Python. I calculate the SHA1 hash of that string with hashlib. I convert it to its hexadecimal representation and take the last 16 characters to use as an identifier:

import hashlib
hash_str = "foobarbazάλφαβήταγάμμα..."
hash_obj = hashlib.sha1(hash_str.encode('utf-8'))  # encode to bytes before hashing
hash_id  = hash_obj.hexdigest()[-16:]              # last 16 hex characters (64 bits)

My goal is an identifier that has a reasonable length and is unlikely to yield the same hash_id value for a different hash_str input.

If the probability of a SHA1 collision is 1/(2^160), or 1/(16^40), then if I take the last sixteen characters of the hex representation, is the probability of a collision only 1/(16^16)? Or are the bytes (or their hex equivalent) not distributed evenly?

Alex Reynolds asked Nov 06 '15 00:11





1 Answer

Yes. A hash function that exhibits the property of uniformity is equally likely to produce any value in its output range for a randomly chosen input. Each value of the truncated hash is therefore equally likely too. SHA-1 is a hash function designed to exhibit uniformity, so your conjecture holds: two different inputs end up with the same 16 hex characters with probability 1/(16^16).
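As a rough empirical check of the uniformity claim, the sketch below hashes a batch of synthetic inputs and counts how often each hex digit appears as the final character of the truncated identifier. The helper names (truncated_id, last_digit_counts), the "input-N" strings, and the sample size are assumptions made for illustration and are not part of the question. A flat histogram is consistent with uniformity but does not prove it.

```python
import hashlib
from collections import Counter


def truncated_id(text, length=16):
    """Return the last `length` hex characters of the SHA-1 digest of `text`."""
    digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
    return digest[-length:]


def last_digit_counts(n_inputs):
    """Count how often each hex digit is the final character of the identifier."""
    # The "input-N" strings are hypothetical sample data.
    return Counter(truncated_id(f"input-{i}")[-1] for i in range(n_inputs))


N_INPUTS = 160_000           # assumed sample size: about 10,000 per hex digit
counts = last_digit_counts(N_INPUTS)
expected = N_INPUTS / 16     # count per digit if the output is uniform

# 16 hex characters carry 64 bits, so the identifier space has 16^16 values.
ID_SPACE = 16 ** 16

for digit in sorted(counts):
    deviation = (counts[digit] - expected) / expected
    print(f"{digit}: {counts[digit]:6d} ({deviation:+.2%} from expected)")
```

Each digit should land close to the expected count, with only small random deviations. The same counting approach works for any other position in the hex string.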

abligh answered Sep 21 '22 13:09
