Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the fastest hash function for pointers?

Hash table based containers are very fast associative array (e.g. unordered_map, unordered_set).

Their performance is highly dependent on that hash function used to create an index for each entry. As hash tables grow, elements are rehashed again and again.

Pointers are simple type, basically a 4/8 byte value that uniquely identify an object. The problem is that using an address as a result of the hash function is not efficient due to several LSB being zero.

Example:

struct MyVoidPointerHash {     size_t operator()(const void* val) const {         return (size_t)val;     } }; 

A faster implementation is to lose a few bits:

struct MyVoidPointerHash2 {     size_t operator()(const void* val) const {         return ((size_t)val) >> 3; // 3 on 64 bit, 1 on 32 bit     } }; 

The latter produced 10-20% performance increase on a large application that uses hash sets and maps with tens of thousands of elements that are frequently built and cleared.

Can someone offer a better scheme for hashing pointers?

The function needs to be:

  1. Fast! and must inline well.
  2. Offer a reasonable distribution, rare collisions are allowed.

Update - benchmark results

I ran two sets of tests, one for int* and for a class pointer that has a size of 4KB. The results are very interesting.

I used std::unordered_set for all test with data size being 16MB that was allocated in a single new call. The first algorithm ran twice to make sure sure caches are as hot as possible and the CPU is running at full speed.

Setup: VS2013 (x64), i7-2600, Windows 8.1 x64.

  • VS2013 default hash function
  • Hash1: return (size_t)(val);
  • Hash2: return '(size_t)(val) >> 3;
  • Hash3(@BasileStarynkevitch): uintptr_t ad = (uintptr_t)val; return (size_t)((13 * ad) ^ (ad >> 15));
  • Hash4(@Roddy): uintptr_t ad = (uintptr_t)val; return (size_t)(ad ^ (ad >> 16));
  • Hash5(@egur):

Code:

template<typename Tval> struct MyTemplatePointerHash1 {     size_t operator()(const Tval* val) const {         static const size_t shift = (size_t)log2(1 + sizeof(Tval));         return (size_t)(val) >> shift;     } }; 

Test 1 - int*:

  • VS2013 default took 1292ms
  • Hash1 took 742ms
  • Hash2 took 343ms
  • Hash3 took 1008ms
  • Hash4 took 629ms
  • Hash5 took 350ms

Test 1 - 4K_class*:

  • VS2013 default took 0.423ms
  • Hash1 took 23.889ms
  • Hash2 took 6.331ms
  • Hash3 took 0.366ms
  • Hash4 took 0.390ms
  • Hash5 took 0.290ms

Update2:

Winner so far is the templated hash (Hash5) function. Best level of performance for speed for various block sizes.

Update 3: Added default hash function for baseline. Turns out it's far from optimal.

like image 941
egur Avatar asked Jan 06 '14 15:01

egur


2 Answers

The correct answer from a theoretical point of view is: "Use std::hash which is likely specialized to be as good as it gets, and if that is not applicable, use a good hash function rather than a fast one. The speed of the hash function does not matter so much as its quality".

The practical answer is: "Use std::hash, which is piss-poor, but nevertheless performs surprisingly well."

TL;DR
After having become intrigued, I ran about 30 hours of benchmarks over the weekend. Among other things, I tried to get an average case vs. worst case and tried to coerce std::unordered_map into worst-case behavior by giving it deliberately bad hints on bucket counts in respect of the set size inserted.

I compared poor hashes (std::hash<T*>) to well-known general purpose hashes of overall good quality (djb2, sdbm) as well as variations of these that account for the very short input length, and to hashes which are explicitly thought to be used in hashtables (murmur2 and murmur3), and piss-poor hashes that are actually worse than not hashing at all since they throw away entropy.
Since the lowest 2-3 bits are always zero on pointers due to alignment, I deemed it worthwhile to test a simple right-shift as "hash", so only the non-zero information would be used, in case the hashtable e.g. used only the lowermost N bits. Turned out that for reasonable shifts (I also tried unreasonable shifts!) this actually performs quite well.

Findings

Some of my findings were well-known and unsurprising, others are very surprising:

  • It is hard to predict what is a "good" hash. Writing good hash functions is hard. Not surprising, well-known fact, and once again proven.
  • No single hash significantly outperforms all others in every scenario. No single hash even significantly outperforms all others 80% of the time. The first result was expected, the second is nevertheless surprising.
  • It is really hard to press std::unordered_map into behaving badly. Even when given deliberately bad hints to bucket counts which will force several re-hashes, the overall performance is not much worse. Only the very worst hash functions that throw away most of the entropy in an almost ridiculous way are able to significantly impact performance by more than 10-20% (such as right_shift_12, which practically results in only 12 distinct hash values for 50,000 inputs! It is not surprising that the hash map runs around 100 times slower in this case -- we're basically doing random access lookup on a linked list.).
  • Some "funny" results are surely due to implementation details. My implementation (GCC) uses a slighly-larger-than-2^N prime bucket count, and inserts values with indentical hashes head-first into linked lists.
  • The specialization of std::hash<T*> is outright pathetic for GCC (a simple reinterpret_cast). Funnily, a functor which does the identical thing consistently performs faster at insertions and slower at random access. The difference is small (a dozen milliseconds on a 8-10 second test run), but it is not noise, it's consistenly present -- probably related to instruction reordering or pipelining. It's stunning how the exact same code (which is also a no-op) consistenly performs differently in two different scenarios.
  • Pathetic hashes do not perform significantly worse than "good" hashes or hashes explicitly designed for hash tables. Indeed, half of the time, they are the best performers, or among the top 3.
  • The "best" hash functions rarely if ever result in the best overall performance.
  • The hashes posted as answers in this SO question are generally OK. They are good average, but not superior to std::hash. Usually they'll land in the top 3-4.
  • Poor hashes are somewhat vulnerable to the order of insertion (performing worse on random insertion and random lookup following random insertion) whereas "good" hashes are more resilient to the impact of the order of insertion (little or no difference), but overall performance is still slightly slower.

Test Setup

Test were done not just any 4-byte or 8-byte (or whatever) aligned values, but for actual addresses obtained from allocating the complete set of elements on the heap, and storing the addresses as provided by the allocator in a std::vector (the objects were then deleted, they're not needed).
Addresses were inserted into a newly allocated std::unordered_map for every test in the order stored in the vector, once in the original order ("sequential") and once after applying a std::random_shuffle on the vector.

Tests were performed for sets of 50,000 and 1,000,000 objects of size 4, 16, 64, 256, and 1024 (results for 64 omitted here for brevity, they're as you'd expect somewhere in the middle between 16 and 256 -- StackOverflow only allows 30k characters being posted).
Test suite was performed 3 times, results varying by 3 or 4 milliseconds here and there, but overall identical. The results posted here are the last run.

The order of insertions in the "random" test as well as the access pattern (in every test) is pseudorandom, but exactly identical for every hash function in a test run.

The timings under hash benchmarks are for summing up 4,000,000,000 hash values in an integer variable.

The column insert is the time in milliseconds for 50 iterations of creating a std::unordered_map, inserting 50,000 and 1,000,000 elements respectively, and destroying the map.

The column access is the time in milliseconds to do 100,000,000 lookups of a pseudorandom element in the 'vector' followed by looking up that address in the unordered_map.
This timing includes on the average one cache misse for accessing a random element in the vector, at least for the large dataset (small dataset fits completely into L2).

All timings on a 2.66GHz Intel Core2, Windows 7, gcc 4.8.1/MinGW-w64_32. Timer granularity @ 1ms.

Source Code

Source code is available on Ideone, again because of Stackoverflow's 30k character limit.

Note: Running the complete test suite takes well over 2 hours on a desktop PC, so be prepared to take a walk if you want to reproduce the results.

Test Results

Benchmarking hash funcs... std::hash                 2576       reinterpret_cast          2561       djb2                      13970      djb2_mod                  13969      sdbm                      10246      yet_another_lc            13966      murmur2                   11373      murmur3                   15129      simple_xorshift           7829       double_xorshift           13567      right_shift_2             5806       right_shift_3             5866       right_shift_4             5705       right_shift_5             5691       right_shift_8             5795       right_shift_12            5728       MyTemplatePointerHash1    5652       BasileStarynkevitch       4315        -------------------------------- sizeof(T)       = 4 sizeof(T*)      = 4 insertion order = sequential  dataset size    =    50000 elements name                      insert     access     std::hash                 421        6988        reinterpret_cast          408        7083        djb2                      451        8875        djb2_mod                  450        8815        sdbm                      455        8673        yet_another_lc            443        8292        murmur2                   478        9006        murmur3                   490        9213        simple_xorshift           460        8591        double_xorshift           477        8839        right_shift_2             416        7144        right_shift_3             422        7145        right_shift_4             414        6811        right_shift_5             425        8006        right_shift_8             540        11787       right_shift_12            1501       49604       MyTemplatePointerHash1    410        7138        BasileStarynkevitch       445        8014         -------------------------------- sizeof(T)       = 4 sizeof(T*)      = 4 insertion order = random  dataset size    =    50000 elements name                      insert     access     std::hash                 443        7570        reinterpret_cast          436        7658        djb2                      473        8791        djb2_mod                  472        8766        sdbm                      472        8817        yet_another_lc            458        8419        murmur2                   479        9005        murmur3                   491        9205        simple_xorshift           464        8591        double_xorshift           476        8821        right_shift_2             441        7724        right_shift_3             440        7716        right_shift_4             450        8061        right_shift_5             463        8653        right_shift_8             649        16320       right_shift_12            3052       114185      MyTemplatePointerHash1    438        7718        BasileStarynkevitch       453        8140         -------------------------------- sizeof(T)       = 4 sizeof(T*)      = 4 insertion order = sequential  dataset size    =  1000000 elements name                      insert     access     std::hash                 8945       32801       reinterpret_cast          8796       33251       djb2                      11139      54855       djb2_mod                  11041      54831       sdbm                      11459      36849       yet_another_lc            14258      57350       murmur2                   16300      39024       murmur3                   16572      39221       simple_xorshift           14930      38509       double_xorshift           16192      38762       right_shift_2             8843       33325       right_shift_3             8791       32979       right_shift_4             8818       32510       right_shift_5             8775       30436       right_shift_8             10505      35960       right_shift_12            30481      91350       MyTemplatePointerHash1    8800       33287       BasileStarynkevitch       12885      37829        -------------------------------- sizeof(T)       = 4 sizeof(T*)      = 4 insertion order = random  dataset size    =  1000000 elements name                      insert     access     std::hash                 12183      33424       reinterpret_cast          12125      34000       djb2                      22693      51255       djb2_mod                  22722      51266       sdbm                      15160      37221       yet_another_lc            24125      51850       murmur2                   16273      39020       murmur3                   16587      39270       simple_xorshift           16031      38628       double_xorshift           16233      38757       right_shift_2             11181      33896       right_shift_3             10785      33660       right_shift_4             10615      33204       right_shift_5             10357      38216       right_shift_8             15445      100348      right_shift_12            73773      1044919     MyTemplatePointerHash1    11091      33883       BasileStarynkevitch       15701      38092        -------------------------------- sizeof(T)       = 64 sizeof(T*)      = 4 insertion order = sequential  dataset size    =    50000 elements name                      insert     access     std::hash                 415        8243        reinterpret_cast          422        8321        djb2                      445        8730        djb2_mod                  449        8696        sdbm                      460        9439        yet_another_lc            455        9003        murmur2                   475        9109        murmur3                   482        9313        simple_xorshift           463        8694        double_xorshift           465        8900        right_shift_2             416        8402        right_shift_3             418        8405        right_shift_4             423        8366        right_shift_5             421        8347        right_shift_8             453        9195        right_shift_12            666        18008       MyTemplatePointerHash1    433        8191        BasileStarynkevitch       466        8443         -------------------------------- sizeof(T)       = 64 sizeof(T*)      = 4 insertion order = random  dataset size    =    50000 elements name                      insert     access     std::hash                 450        8135        reinterpret_cast          457        8208        djb2                      470        8736        djb2_mod                  476        8698        sdbm                      483        9420        yet_another_lc            476        8953        murmur2                   481        9089        murmur3                   486        9283        simple_xorshift           466        8663        double_xorshift           468        8865        right_shift_2             456        8301        right_shift_3             456        8302        right_shift_4             453        8337        right_shift_5             457        8340        right_shift_8             505        10379       right_shift_12            1099       34923       MyTemplatePointerHash1    464        8226        BasileStarynkevitch       466        8372         -------------------------------- sizeof(T)       = 64 sizeof(T*)      = 4 insertion order = sequential  dataset size    =  1000000 elements name                      insert     access     std::hash                 9548       35362       reinterpret_cast          9635       35869       djb2                      10668      37339       djb2_mod                  10763      37311       sdbm                      11126      37145       yet_another_lc            11597      39944       murmur2                   16296      39029       murmur3                   16432      39280       simple_xorshift           16066      38645       double_xorshift           16108      38778       right_shift_2             8966       35953       right_shift_3             8916       35949       right_shift_4             8973       35504       right_shift_5             8941       34997       right_shift_8             9356       31233       right_shift_12            13831      45799       MyTemplatePointerHash1    8839       31798       BasileStarynkevitch       15349      38223        -------------------------------- sizeof(T)       = 64 sizeof(T*)      = 4 insertion order = random  dataset size    =  1000000 elements name                      insert     access     std::hash                 14756      36237       reinterpret_cast          14763      36918       djb2                      15406      38771       djb2_mod                  15551      38765       sdbm                      14886      37078       yet_another_lc            15700      40290       murmur2                   16309      39024       murmur3                   16432      39381       simple_xorshift           16177      38625       double_xorshift           16073      38750       right_shift_2             14732      36961       right_shift_3             14170      36965       right_shift_4             13687      37295       right_shift_5             11978      35135       right_shift_8             11498      46930       right_shift_12            25845      268052      MyTemplatePointerHash1    10150      32046       BasileStarynkevitch       15981      38143        -------------------------------- sizeof(T)       = 256 sizeof(T*)      = 4 insertion order = sequential  dataset size    =    50000 elements name                      insert     access     std::hash                 432        7957        reinterpret_cast          429        8036        djb2                      462        8970        djb2_mod                  453        8884        sdbm                      460        9110        yet_another_lc            466        9015        murmur2                   495        9147        murmur3                   494        9300        simple_xorshift           479        8792        double_xorshift           477        8948        right_shift_2             430        8120        right_shift_3             429        8132        right_shift_4             432        8196        right_shift_5             437        8324        right_shift_8             425        8050        right_shift_12            519        11291       MyTemplatePointerHash1    425        8069        BasileStarynkevitch       468        8496         -------------------------------- sizeof(T)       = 256 sizeof(T*)      = 4 insertion order = random  dataset size    =    50000 elements name                      insert     access     std::hash                 462        7956        reinterpret_cast          456        8046        djb2                      490        9002        djb2_mod                  483        8905        sdbm                      482        9116        yet_another_lc            492        8982        murmur2                   492        9120        murmur3                   492        9276        simple_xorshift           477        8761        double_xorshift           477        8903        right_shift_2             458        8116        right_shift_3             459        8124        right_shift_4             462        8281        right_shift_5             463        8370        right_shift_8             458        8069        right_shift_12            662        16244       MyTemplatePointerHash1    459        8091        BasileStarynkevitch       472        8476         -------------------------------- sizeof(T)       = 256 sizeof(T*)      = 4 insertion order = sequential  dataset size    =  1000000 elements name                      insert     access     std::hash                 9756       34368       reinterpret_cast          9718       34897       djb2                      10935      36894       djb2_mod                  10820      36788       sdbm                      11084      37857       yet_another_lc            11125      37996       murmur2                   16522      39078       murmur3                   16461      39314       simple_xorshift           15982      38722       double_xorshift           16151      38868       right_shift_2             9611       34997       right_shift_3             9571       35006       right_shift_4             9135       34750       right_shift_5             8978       32878       right_shift_8             8688       30276       right_shift_12            10591      35827       MyTemplatePointerHash1    8721       30265       BasileStarynkevitch       15524      38315        -------------------------------- sizeof(T)       = 256 sizeof(T*)      = 4 insertion order = random  dataset size    =  1000000 elements name                      insert     access     std::hash                 14169      36078       reinterpret_cast          14096      36637       djb2                      15373      37492       djb2_mod                  15279      37438       sdbm                      15531      38247       yet_another_lc            15924      38779       murmur2                   16524      39109       murmur3                   16422      39280       simple_xorshift           16119      38735       double_xorshift           16136      38875       right_shift_2             14319      36692       right_shift_3             14311      36776       right_shift_4             13932      35682       right_shift_5             12736      34530       right_shift_8             9221       30663       right_shift_12            15506      98465       MyTemplatePointerHash1    9268       30697       BasileStarynkevitch       15952      38349        -------------------------------- sizeof(T)       = 1024 sizeof(T*)      = 4 insertion order = sequential  dataset size    =    50000 elements name                      insert     access     std::hash                 421        7863        reinterpret_cast          419        7953        djb2                      457        8983        djb2_mod                  455        8927        sdbm                      445        8609        yet_another_lc            446        8892        murmur2                   492        9090        murmur3                   507        9294        simple_xorshift           467        8687        double_xorshift           472        8872        right_shift_2             432        8009        right_shift_3             432        8014        right_shift_4             435        7998        right_shift_5             442        8099        right_shift_8             432        7914        right_shift_12            462        8911        MyTemplatePointerHash1    426        7744        BasileStarynkevitch       467        8417         -------------------------------- sizeof(T)       = 1024 sizeof(T*)      = 4 insertion order = random  dataset size    =    50000 elements name                      insert     access     std::hash                 452        7948        reinterpret_cast          456        8018        djb2                      489        9037        djb2_mod                  490        8992        sdbm                      477        8795        yet_another_lc            491        9179        murmur2                   502        9078        murmur3                   507        9273        simple_xorshift           473        8671        double_xorshift           480        8873        right_shift_2             470        8105        right_shift_3             470        8100        right_shift_4             476        8333        right_shift_5             468        8065        right_shift_8             469        8094        right_shift_12            524        10216       MyTemplatePointerHash1    451        7826        BasileStarynkevitch       472        8419         -------------------------------- sizeof(T)       = 1024 sizeof(T*)      = 4 insertion order = sequential  dataset size    =  1000000 elements name                      insert     access     std::hash                 10910      38432       reinterpret_cast          10892      38994       djb2                      10499      38985       djb2_mod                  10507      38983       sdbm                      11318      37450       yet_another_lc            11740      38260       murmur2                   16960      39544       murmur3                   16816      39647       simple_xorshift           16096      39021       double_xorshift           16419      39183       right_shift_2             10219      38909       right_shift_3             10012      39036       right_shift_4             10642      40284       right_shift_5             10116      38678       right_shift_8             9083       32914       right_shift_12            9376       31586       MyTemplatePointerHash1    8777       30129       BasileStarynkevitch       16036      38913        -------------------------------- sizeof(T)       = 1024 sizeof(T*)      = 4 insertion order = random  dataset size    =  1000000 elements name                      insert     access     std::hash                 16304      38695       reinterpret_cast          16241      39641       djb2                      16377      39533       djb2_mod                  16428      39526       sdbm                      15402      37811       yet_another_lc            16180      38730       murmur2                   16679      39355       murmur3                   16792      39586       simple_xorshift           16196      38965       double_xorshift           16371      39127       right_shift_2             16445      39263       right_shift_3             16598      39421       right_shift_4             16378      39839       right_shift_5             15517      38442       right_shift_8             11589      33547       right_shift_12            11992      49041       MyTemplatePointerHash1    9052       30222       BasileStarynkevitch       16163      38829       
like image 132
Damon Avatar answered Oct 13 '22 08:10

Damon


After letting this question lay for a while, I'll post my best hash function for pointers so far:

template<typename Tval> struct MyTemplatePointerHash1 {     size_t operator()(const Tval* val) const {         static const size_t shift = (size_t)log2(1 + sizeof(Tval));         return (size_t)(val) >> shift;     } }; 

It's high performing for various block sizes.
If someone has a better function, I'll change the accepted answer.

like image 32
egur Avatar answered Oct 13 '22 06:10

egur