Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What hashing algorithm does memcached use to hash keys?

Memcached uses distributed consistent hashing to choose which server to put a key on but which hashing algo does it use to map string key into the final hash on which the Ketama algo is applied for server selection. And how good is that algo at spreading similar keys to different servers.

like image 761
Usman Ismail Avatar asked Oct 12 '25 15:10

Usman Ismail


1 Answers

According to the source code in hash.c, memcached uses the following algorithm:

The hash function used here is by Bob Jenkins, 1996:

http://burtleburtle.net/bob/hash/doobs.html

"By Bob Jenkins, 1996. [email protected]. You may use this code any way you wish, private, educational, or commercial. It's free."

From Bob Jenkins' website:

I offer you a new hash function for hash table lookup that is faster and more thorough than the one you are using now. I also give you a way to verify that it is more thorough.

Also, his requirements are:

  • The keys are unaligned variable-length byte arrays.
  • Sometimes keys are several such arrays.
  • Sometimes a set of independent hash functions were required.
  • Average key lengths ranged from 8 bytes to 200 bytes.
  • Keys might be character strings, numbers, bit-arrays, or weirder things.
  • Table sizes could be anything, including powers of 2.
  • The hash must be faster than the old one.
  • The hash must do a good job.

...

The real requirement then is that a good hash function should distribute hash values uniformly for the keys that users actually use.

To get back to your other question, he measured the ability of the algorithm to uniformly distribute hash values, so I would presume that the hash does a good job at spreading similar keys to different servers. If you have concerns, the code is isolated so you should be able to run your own tests.

like image 61
Justin Ethier Avatar answered Oct 16 '25 12:10

Justin Ethier