Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How safely can I assume unicity of a part of SHA1 hash?

Tags:

security

sha1

I'm currently using a SHA1 to somewhat shorten an url:

Digest::SHA1.hexdigest("salt-" + url)

How safe is it to use only the first 8 characters of the SHA1 as a unique identifier, like GitHub does for commits apparently?

like image 699
Thibaut Barrère Avatar asked Mar 22 '11 08:03

Thibaut Barrère


2 Answers

To calculate the probability of a collision with a given length and the number of hashes that you have, see the birthday problem. I don't know the number of hashes that you are going to have, but here are some examples. 8 hexadecimal characters is 32 bits, so for about 100 hashes the probability of a collision is about 1/1,000,000, for 10,000 hashes it's about 1/100, for 100,000 it's 3/4 etc.

See the table in the Birthday attack article on Wikipedia to find a good hash length that would satisfy your needs. For example if you want the collision to be less likely than 1/1,000,000,000 for a set of more than 100,000 hashes then use 64 bits, or 16 hexadecimal digits.

It all depends on how many hashes are you going to have and what probability of a collision are you willing to accept (because there is always some probability, even if insanely small).

like image 76
rsp Avatar answered Nov 22 '22 05:11

rsp


If you're talking about a SHA-1 in hexadecimal, then you're only getting 4 bits per character, for a total of 32 bits. The chances of a collision are inversely proportional to the square root of that maximum value, so about 1/65536. If your URL shortener gets used much, it probably won't take terribly long before you start to see collisions.

As for alternatives, the most obvious is probably to just maintain a counter. Since you need to store a table of URLs to translate your shortened URL back to the original, you basically just store each new URL in your table. If it was already present, you give its existing number. Otherwise, you insert it and give it a new number. Either way, you give that number to the user.

like image 27
Jerry Coffin Avatar answered Nov 22 '22 04:11

Jerry Coffin