Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Hash function that protects against collisions, not attacks. (Produces a random UUID-size result space)

Using SHA1 to hash down larger size strings so that they can be used as a keys in a database.

Trying to produce a UUID-size string from the original string that is random enough and big enough to protect against collisions, but much smaller than the original string.

Not using this for anything security related.

Example:

# Take a very long string, hash it down to a smaller string behind the scenes and use
#     the hashed key as the data base primary key instead
def _get_database_key(very_long_key):
    return hashlib.sha1(very_long_key).digest()

Is SHA1 a good algorithm to be using for this purpose? Or is there something else that is more appropriate?

like image 314
Chris Dutrow Avatar asked Mar 03 '13 07:03

Chris Dutrow


People also ask

Is a UUID a hash?

Version-3 and version-5 UUIDs are generated by hashing a namespace identifier and name. Version 3 uses MD5 as the hashing algorithm, and version 5 uses SHA-1.

What approaches may be used to avoid collisions in hash functions?

Chaining is a technique used for avoiding collisions in hash tables. A collision occurs when two keys are hashed to the same index in a hash table.

Can a hash function have no collisions?

It's not possible to avoid collisions with a hash. If you have no collisions then you don't have a hashing function. The goal is to minimize collisions, not eliminate them. You'll always have contention unless you have more possible hashes than possible inputs, which sort of defeats the point of hashing.

How do you reduce collisions in hashing?

Doubling the size of the table will halve the expected number of collisions. The latter strategy gives rise to an important property of hash tables that we have not seen in any other data structure.


1 Answers

Python has a uuid library, based on RFC 4122.

The version that uses SHA1 is UUIDv5, so the code would be something like this:

import uuid

uuid.uuid5(uuid.NAMESPACE_OID, 'your string here')
like image 156
Ja͢ck Avatar answered Sep 25 '22 16:09

Ja͢ck