Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

6 Character Short Hash Algorithm

Tags:

c#

hash

md5

sha

My goal is to generate a short Hash string of 6 characters (possibly containing characters [A-Z][a-z][0-9]) for a string which is 42 case-insensitive alphanumeric characters in length. Uniqueness is the key requirement. Security or performance is not so important.

Is there a specific algorithm which will give this result or should I stick to truncating a MD5 Hash or a SHA-1 Hash (Like in this question)? If so, what is the probability of a collision?

like image 732
Isuru Avatar asked Aug 27 '13 11:08

Isuru


People also ask

How many hashes are in 8 bit output?

#3 If you have an 8 bit hash output, how many possible hashes are there? There are 28 possibles hashes.

What hash is 40 characters long?

SHA-1 is a hash function designed by the N.S.A. ("SHA" = "Secure Hash Algorithm"). It returns a 160-bit hexadecimal string which is 40 characters long. It was used for secure encryption from 1996 to 2010, largely as a replacement for MD5, but now it is used mostly for data integrity.

How many bytes is SHA-1?

In cryptography, SHA-1 (Secure Hash Algorithm 1) is a cryptographically broken but still widely used hash function which takes an input and produces a 160-bit (20-byte) hash value known as a message digest – typically rendered as 40 hexadecimal digits.

What are the lengths of the hash values?

The hashing functions return a 128-bit, 160-bit, 256-bit, or 512-bit hash of the input data, depending on the algorithm selected.


1 Answers

Your best bet would be the truncating well-known hash function (MD5 or SHA-family) because these algorithms have statistically good uniform distributions of the hash values (and also using full hash and not just 6 chars there).

Now some calculations for probability of collision

- Number of letters in English alphabet: 26
- Add capitals: 26
- Add numerics: 10
--------------

In total you get 26 + 26 + 10 = 62 characters. 

Now you have 6 places, which gives you 62^6 possible combinations.
That is 56.800.235.584 ~ 57 billion combinations. 
This is a space of possible hash values - N.
--------------
To compute collisions let's use the formula 

Pcollision = K^2 / 2N

Which is a very rough approximation of collision probability

Now let's see the result table for a number of items in a table - K

# items     | Probability of collision
---------------------------------------
10          |  1.7 * 10^-9
100         |  1.7 * 10^-7
1K          |  1.7 * 10^-5
10K         |  1.7 * 10^-3
100K        |  0.17

This formula can only be used for small K, but it shows that given 100K entries in the hash table you would roughly have 17% chance of collision.

Links

Collision probability

like image 151
oleksii Avatar answered Sep 23 '22 11:09

oleksii