
Using hashlib.sha256 to create a unique id; is this guaranteed to be unique?

I am trying to create a unique record id using the following function:

import hashlib
from base64 import b64encode

def make_uid(salt, pepper, key):
  s = b64encode(salt)
  p = b64encode(pepper)
  k = b64encode(key)
  return hashlib.sha256(s + p + k).hexdigest()

Where pepper is set like this:

import uuid
uuid_pepper = uuid.uuid4()
pepper = str(uuid_pepper).encode('ascii')

And salt and key are the same values for every request.

My question is: because the pepper is unique, will make_uid in this instance always return a unique value, or is there a chance that it can create a duplicate?

The suggested answer is different because I'm not asking about the uniqueness of various uuid types, I'm wondering whether it's at all possible for a sha256 hash to create a collision between two distinct inputs.

mwkrimson asked Dec 08 '22 18:12


1 Answer

I think what you want to know is whether SHA-256 is guaranteed to produce a unique hash for every distinct input. The answer is yes and no. The figures below come from my own research; they are approximate, not exact.

In theory, SHA-256 must collide. It has 2^256 possible outputs, so if we hash 2^256 + 1 distinct inputs, at least two of them must share a hash (the pigeonhole principle). Even worse, by the birthday bound, the probability of at least one collision among 2^130 hashes is over 99%.
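To make the 2^130 figure above concrete, here is a minimal sketch of the standard birthday-bound approximation, p ≈ 1 - exp(-n(n-1) / (2N)), where n is the number of hashes and N = 2^256 is the number of possible outputs. The function name collision_probability is my own, and the formula assumes SHA-256 outputs behave like uniformly random values, which is an idealization.

```python
import math

def collision_probability(num_hashes: int, output_bits: int = 256) -> float:
    """Approximate chance of at least one collision among num_hashes
    uniformly random outputs of output_bits bits (birthday bound)."""
    space = 2 ** output_bits
    # Python integers are arbitrary precision, so this ratio is computed
    # without overflow before being converted to a float.
    exponent = -(num_hashes * (num_hashes - 1)) / (2 * space)
    # -expm1(x) equals 1 - exp(x) but stays accurate when the result is tiny.
    return -math.expm1(exponent)

if __name__ == "__main__":
    for n in (10 ** 12, 2 ** 128, 2 ** 130):
        print(f"{n:.3e} hashes -> p = {collision_probability(n):.6g}")
```

Under this approximation, 2^130 hashes give a probability above 99%, while a trillion hashes give a probability that is effectively zero.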

But you almost certainly won't generate one in your lifetime. Assume a computer that can calculate 10,000 hashes per second. It would take that computer about 4 * 10^27 years to finish 2^130 hashes. To put that number in perspective, it is about 2 * 10^22 times as long as humans have existed on Earth (taking that as roughly 200,000 years). So even if you had been hashing since the first day humans were on Earth, the probability of a collision would still be vanishingly small.
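The arithmetic above can be checked with a few lines of Python. This is only a sketch: the rate of 10,000 hashes per second is the assumption stated in the answer, while the 365.25-day year and the 200,000-year figure for human existence are my own round numbers, and the name years_to_hash is made up for illustration.

```python
HASHES_PER_SECOND = 10_000                  # rate assumed in the answer
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60    # assumed average year length
HUMAN_HISTORY_YEARS = 200_000               # rough, assumed figure

def years_to_hash(num_hashes: int, rate: float = HASHES_PER_SECOND) -> float:
    """Years needed to compute num_hashes hashes at rate hashes per second."""
    return num_hashes / rate / SECONDS_PER_YEAR

years = years_to_hash(2 ** 130)
ratio = years / HUMAN_HISTORY_YEARS

if __name__ == "__main__":
    print(f"years needed: {years:.2e}")
    print(f"multiples of human history: {ratio:.2e}")
```

Both results land on the same orders of magnitude as the figures quoted above: a few times 10^27 years, and a few times 10^22 human histories.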

Hope that answers your question.

Junbang Huang answered Feb 06 '23 17:02