Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it possible to adapt this lock-free 32-bit hash-table algorithm for 64-bit keys?

Problem & context

This post describes a lock-free 32-bit hash-table algorithm. The core of the algorithm is a lock-free linear search, which is used to insert a key-val pair in a (logical) list:

enter image description here

Here is the code provided:

void ArrayOfItems::SetItem(uint32_t key, uint32_t value)
{
    for (uint32_t idx = 0;; idx++)
    {
        uint32_t prevKey = mint_compare_exchange_strong_32_relaxed(&m_entries[idx].key, 0, key);
        if ((prevKey == 0) || (prevKey == key))
        {
            mint_store_32_relaxed(&m_entries[idx].value, value);
            return;
        }
    }
}

For a specific problem, I need to insert random key-val pairs in a table. As such, I need at least 64-bit keys, because with 32-bits there is 50% chance of collision after 65536 insertions, which is too low. Unfortunately, I do not have 64-bit cmpxchg as a primitive.

Question

Is it possible to generalize the hash-table above to 64-bit keys, using only 32-bit cmpxchg?

like image 564
MaiaVictor Avatar asked May 30 '18 17:05

MaiaVictor


1 Answers

I'm not sure from the question whether you are still wanting to retain the lock-free characteristic, or just want to get 64-bit key/value storage up & running. (?)

There is a 64-bit MurmurHash3 posted here by @kol: hashing a small number to a random looking 64 bit integer

Clearly, if you introduced a second array to assert key-location ownership, and respected that for value storage, you could then read and CAS 64-bit values in two steps, then release ownership. That doesn't get you lock-free, of course.

--------------- Edit: ---------------
The author has at least two videos on his hash table, both from 2007:

Advanced Topics in Programming Languages: A Lock-Free Hash Table https://www.youtube.com/watch?v=HJ-719EGIts

A Fast Wait-Free Hash Table
https://youtu.be/WYXgtXWejRM

He relates his program flow to a finite-state machine. Disregarding the issue of growing the table, there are four states a location can be in before a potential mutation is applied to that location. Key/Value = [nil/nil], [X/nil], [X/X], [nil/X].

Reading the state in preparation for mutation does not guarantee, under concurrency, that the state remains unchanged at the time the mutation is applied.

With 32 bit operations, we have the following logic:
- If the read key = the desired key, the value can be written to the location.
- If the read key = nil, and the value = non-nil, then another thread is mutating the location.
- if the read key = nil, and the value = nil, then the location can be written via a successful key CAS.

If you want to use 32 bit atomic operations to store 64 bit data without locking, then the state diagram is increased in size, with more failure states, eg:
- You could read a half-created Key.
- A CAS update of one half of a Value entry may be stomped by another thread, failing on the second CAS.
- A CAS creation of one half of a Key entry may be stomped by another thread, failing on the second CAS.
- The 32 bit representation of array initialiser 'nil' should be excluded from occurring as either half of a 64 bit key or value

The process of increasing the table size adds some more states to consider, also.

like image 63
Paul McGee Avatar answered Sep 29 '22 21:09

Paul McGee