
Atomic shared_ptr for lock-free singly linked list

I'm wondering if it is possible to create a lock-free, thread-safe shared pointer for any of the "common" architectures, like x64 or ARMv7 / ARMv8.

In a talk about lock-free programming at CppCon 2014, Herb Sutter presented a (partial) implementation of a lock-free singly linked list. The implementation looks quite simple, but it relies on an atomic shared_ptr implementation that doesn't exist in the standard library yet, or on using the specialized std::atomic... functions. This is especially important as single push/pop calls potentially invoke multiple atomic loads/stores and compare_exchange operations.

The problem I see (and I think some of the questions in the talk went into the same direction) is that for this to be an actual lock-free data structure, those atomic operations would have to be lock-free themselves. I don't know of any standard library implementation for the std::atomic... functions that is lock-free and - at least with a short google / SO search - I also didn't find a suggestion of how to implement a lock-free specialization for std::atomic<std::shared_ptr>.

Now, before I waste my time on this, I wanted to ask:

  • Do you know if it is possible to write a lock-free, atomic shared pointer at all?
  • Are there already implementations that I've overlooked and that are, ideally, even compatible with what you would expect from a std::atomic<std::shared_ptr>? For the mentioned list it would especially require a CAS operation.
  • If there is no way to implement this on current architectures, do you see any other benefit in Herb's implementation compared to a "normal" linked list that is protected by a lock?

For reference, here is the code from Herb Sutter (might contain typos from me):

template<class T>
class slist {
    struct Node { T t; std::shared_ptr<Node> next; };
    std::atomic<std::shared_ptr<Node>> head;
public:
    class reference {
        std::shared_ptr<Node> p;
    public:
        reference(std::shared_ptr<Node> p_) : p{std::move(p_)} {}
        T& operator*() { return p->t; }
        T* operator->() { return &p->t; }
    };
    auto find(T t) const {
        auto p = head.load();
        while (p && p->t != t) {
            p = p->next;
        }
        return reference(std::move(p));
    }
    void push_front(T t) {
        auto p = std::make_shared<Node>();
        p->t = t;
        p->next = head.load();
        while (!head.compare_exchange_weak(p->next, p)) {}
    }
    void pop_front() {
        auto p = head.load();
        while (p && !head.compare_exchange_weak(p, p->next)) {}
    }
};

Note that in this implementation, single instances of a shared_ptr can be accessed / modified by multiple different threads. They can be read/copied, reset and even deleted (as part of a node). So this is not about whether multiple different shared_ptr objects (that manage the same object) can be used by multiple threads without a race condition - that is already true for current implementations and required by the standard - but it is about concurrent access to a single pointer instance, which is - for standard shared pointers - no more thread-safe than the same operations on raw pointers would be.


To explain my motivation:
This is mainly an academic question. I have no intention of implementing my own lock-free list in production code, but I find the topic interesting, and at first glance Herb's presentation seemed to be a good introduction. However, while thinking about this question and @sehe's comment on my answer, I remembered this talk, had another look at it and realized that it doesn't make much sense to call Herb's implementation lock-free if its primitive operations require locks (which they currently do). So I was wondering whether this is just a limitation of the current implementations or a fundamental flaw in the design.

asked Jul 07 '15 by MikeMB



2 Answers

I'm adding this as an answer since it's too long to fit in a comment:

Something to consider. A lock-free shared_ptr is not needed to implement lock-free/wait-free data structures.

The reason Sutter uses shared_ptr in his presentation is because the most complicated part of writing lock-free data structures is not the synchronization, but the memory reclamation: we cannot delete nodes while they're potentially accessed by other threads, so we have to leak them and reclaim later. A lock-free shared_ptr implementation essentially provides "free" memory reclamation and makes examples of lock-free code palatable, especially in a context of a time-limited presentation.

Of course, having a lock-free atomic_shared_ptr as part of the Standard would be a huge help. But it's not the only way of doing memory reclamation for lock-free data structures. Alternatives include the naive approach of maintaining a list of nodes to be deleted and reclaiming them at quiescent points in execution (which works in low-contention scenarios only), hazard pointers, and rolling your own atomic reference counting using split counts.
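The "list of nodes to be deleted" scheme can be sketched as follows. This is a minimal, int-only sketch in the spirit of the lock-free stack from C++ Concurrency in Action; the class and member names are my own, and it is not production code. pop() only frees nodes when it is the sole thread inside pop(); otherwise it chains them onto a pending list for later reclamation:

```cpp
#include <atomic>

class lock_free_stack {
    struct node { int value; node* next; };
    std::atomic<node*> head{nullptr};
    std::atomic<unsigned> threads_in_pop{0};
    std::atomic<node*> to_be_deleted{nullptr};

    static void delete_nodes(node* n) {
        while (n) { node* next = n->next; delete n; n = next; }
    }
    // Atomically splice the chain starting at 'first' onto the pending list.
    void chain_pending(node* first) {
        node* last = first;
        while (last->next) last = last->next;
        last->next = to_be_deleted.load();
        while (!to_be_deleted.compare_exchange_weak(last->next, first)) {}
    }
    void try_reclaim(node* old_head) {
        if (threads_in_pop == 1) {
            // We are the only popper right now: claim the pending list.
            node* pending = to_be_deleted.exchange(nullptr);
            if (--threads_in_pop == 0) delete_nodes(pending);
            else if (pending) chain_pending(pending);  // raced: put it back
            delete old_head;                           // safe: we unlinked it
        } else {
            // Another thread may still be reading old_head: defer deletion.
            old_head->next = nullptr;
            chain_pending(old_head);
            --threads_in_pop;
        }
    }
public:
    void push(int v) {
        node* n = new node{v, head.load()};
        while (!head.compare_exchange_weak(n->next, n)) {}
    }
    bool pop(int& out) {
        ++threads_in_pop;
        node* old = head.load();
        while (old && !head.compare_exchange_weak(old, old->next)) {}
        bool ok = (old != nullptr);
        if (old) { out = old->value; try_reclaim(old); }
        else --threads_in_pop;
        return ok;
    }
};
```

The synchronization itself needs nothing beyond an atomic raw pointer and CAS; all the extra machinery exists purely to decide when a popped node may be freed, which is the point being made above.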

As for performance, @mksteve is correct: lock-free code is not guaranteed to outperform lock-based alternatives unless perhaps it runs on a highly parallel system offering true concurrency. Its goal is to enable maximum concurrency, and because of that what we typically get is threads doing less waiting at the cost of performing more work.

PS If this is something that interests you, you should consider taking a look at C++ Concurrency in Action by Anthony Williams. It dedicates a whole chapter to writing lock-free/wait-free code, which offers a good starting place, walking through implementations of lock-free stack and queue.

answered Oct 15 '22 by Alla


  • Do you know if it is possible to write a lock-free, atomic shared pointer at all?

  • Are there already implementations that I've overlooked and that are, ideally, even compatible with what you would expect from a std::atomic?

I think the std::atomic_... free functions offer a form of implementation, where the slist would perform the special atomic_... operations on the shared_ptr. The problem with this functionality being split across two classes (std::atomic and std::shared_ptr) is that each has constraints which need to be adhered to in order to function, and the class separation makes sharing that knowledge of the constraints impossible.

Within slist, which knows both items, the situation can be helped, and thus the atomic_... functions will probably work.

  • If there is no way to implement this on current architectures, do you see any other benefit in Herb's implementation compared to a "normal" linked list that is protected by a lock?

From Wikipedia's Non-blocking algorithm article: the purpose of the lock-free property is to guarantee that some progress is made by at least one thread.

This does not give a guarantee of better performance than a locked implementation, but does give a guarantee that deadlocks will not occur.

Imagine that T required a lock to perform a copy, and that this lock could also be held by operations outside the list. Then a deadlock would be possible if a lock-based implementation of slist were called while that lock was held.

I think CAS is provided by the compare_exchange_weak member of std::atomic, so it would be implementation-independent.

Current lock-free algorithms for complex structures (e.g. vector, map) tend to be significantly less efficient than locking algorithms (see Dr. Dobb's: Lock-Free Data Structures), but the benefit offered (improved thread progress) would significantly improve the performance of computers, which tend to have large numbers of idle CPUs.

Further research into the algorithms may identify new instructions which could be implemented in the CPUs of the future, to give us wait-free performance and improved utilization of computing resources.

answered Oct 15 '22 by mksteve