Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generate consistent hash of a UUID

I want to generate a consistent hash of a UUID string such as dcc549d8-bd0c-49c2-bff8-f6fa80fb7857, preferably a number between 0 to N.

What is the best and fastest way to do it?

Update: I am thinking of using CRC32. Any pros/cons of it?

like image 841
Madhur Ahuja Avatar asked Jan 10 '20 17:01

Madhur Ahuja


People also ask

How do you achieve consistent hashing?

Consistent hashing optimization The aim is just to ensure each node is responsible for an equal portion of the ring, so that load is evenly distributed. (Another advantage of having multiple hashes for each node is that the hashes can be added to or removed from the ring gradually to avoid sudden spikes of load. )

Is MD5 consistent hashing?

So it is "consistent". However, MD5 takes as input an arbitrary sequence of bits, and outputs a sequence of 128 bits. In many situations, you want strings. For instance, you want to hash a password, and the password is initially a string.

What is consistent in consistent hashing?

In computer science, consistent hashing is a special kind of hashing technique such that when a hash table is resized, only keys need to be remapped on average where is the number of keys and. is the number of slots.

What is hash ring?

Hash rings take the leading bits of a document's hash value and use this to determine which node the document should be assigned. This allows any node in a cluster to know what node the data lives on and how to adapt to new assignment methods as your data grows.


3 Answers

What kind of hash would you like? The 'best' choice might not be fastest and will depend on what you're using the hash for.

For md5, you could do:

var crypto = require('crypto');

var md5sum = crypto.createHash('md5');
md5sum.update(uuid);
var b64 = md5sum.digest('base64')

You could then use a base64 library to convert it to a number if that's what you need.

Node crypto stuff, including other hashing algorithms that might be more appropriate for your case (md5 is faster but less secure), is documented here: https://nodejs.org/api/crypto.html

like image 79
jack Avatar answered Nov 11 '22 20:11

jack


If thinking about the nature of UUID example. I would go in that direction.

const INIT_NUMBER = 271;

function hash(uuid, N) {
  const x = uuid.split("-").reduce((a,b) => a ^ Number.parseInt(b, 16), INIT_NUMBER) ;
  return arguments.length === 1 ? x : x % N;
}

const a = hash("dcc549d8-bd0c-49c2-bff8-f6fa80fb7857");            
const b = hash("dcc549d8-bd0c-49c2-bff8-f6fa80fb7857", 256);

console.log(a, b);
  
  
like image 31
Yaroslav Gaponov Avatar answered Nov 11 '22 19:11

Yaroslav Gaponov


If you're using v4 UUID generation, all digits except for two already contain pseudorandom values, all you need to do is extract them to the form you want.

From the Spec:

4.4.  Algorithms for Creating a UUID from Truly Random or
      Pseudo-Random Numbers

   The version 4 UUID is meant for generating UUIDs from truly-random or
   pseudo-random numbers.

   The algorithm is as follows:

   o  Set the two most significant bits (bits 6 and 7) of the
      clock_seq_hi_and_reserved to zero and one, respectively.

   o  Set the four most significant bits (bits 12 through 15) of the
      time_hi_and_version field to the 4-bit version number from
      Section 4.1.3.

   o  Set all the other bits to randomly (or pseudo-randomly) chosen
      values.

To extract the pseudorandom values, we simply remove the first hex character (which contains the 4 most significant bits) of the third and fourth sections separated by "-" and convert it to the base you want. Because a UUID is always the same length, we can remove the same indexes each time (14 and 19).

Unfortunately, as Javascript only supports up to 32-bit integers, we need to feed Number.parseInt groups of 8 hex characters (32 bits) separately, then add the modulos to minimize bias.

Therefore, our code would be:

function parseUUID(uuid, N) {
    var pseudorandomBytes = uuid.substring(0, 14) + uuid.substring(15, 19) + uuid.substring(20); // Splice out the bytes which contain metadata
    pseudorandomBytes = pseudorandomBytes.replace(/-/g, ""); // Remove all dashes
    var accumulator = 0; // Accumulate the sums modulo N of every eight hex characters and the remainder
    pseudorandomBytes.match(/.{1,8}/g).forEach(function (a) { accumulator = (accumulator + (Number.parseInt(a, 16) % N)) % N; });
    return accumulator; // Return the result
}
like image 40
id01 Avatar answered Nov 11 '22 21:11

id01