Alternatives for locks for synchronisation

I'm currently developing my own little threading library, mainly for learning purposes, and have reached the message queue, which will involve a lot of synchronisation in various places. Until now I've mainly used locks, mutexes and, to a lesser extent, condition variables, which are all variations on the same theme: a lock around a section that should only be used by one thread at a time.

Are there any solutions to synchronisation other than locks? I've read about lock-free synchronisation in places, but some people consider hiding the locks inside containers to be lock-free, which I disagree with: you just aren't using the locks explicitly yourself.

asked Nov 26 '10 04:11

dutt

2 Answers

Lock-free algorithms typically involve using compare-and-swap (CAS) or similar CPU instructions that update some value in memory not only atomically, but also conditionally and with an indicator of success. That way you can code something like this:

1 do
2 {
3     current_value = the_variable
4     new_value = ...some expression using current_value...
5 } while(!compare_and_swap(the_variable, current_value, new_value));

compare_and_swap() atomically checks whether the_variable's value is still current_value, and only if so does it update the_variable's value to new_value and return true.
The exact calling syntax will vary with the CPU, and may involve assembly language or system/compiler-provided wrapper functions. Use the latter if available, as their usage may also restrict other compiler optimisations or issues to safe behaviours; generally, check your docs.
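In C++11 and later, one such compiler-provided wrapper is std::atomic. As a minimal sketch of the loop above (atomic_multiply is a hypothetical helper made up for illustration, not something from a library):

```cpp
#include <atomic>

// Hypothetical helper: atomically multiplies `variable` by `factor`
// and returns the value that ended up being stored.
int atomic_multiply(std::atomic<int>& variable, int factor)
{
    int current_value = variable.load();      // like line 3 of the pseudocode
    int new_value;
    do
    {
        new_value = current_value * factor;   // like line 4
        // If another thread changed `variable` in the meantime,
        // compare_exchange_weak fails and copies the latest stored value
        // into current_value, so the next iteration recomputes from it.
    } while (!variable.compare_exchange_weak(current_value, new_value));
    return new_value;
}
```

compare_exchange_weak is allowed to fail spuriously, which is harmless inside a retry loop like this one.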

The significance is that when another thread updates the variable after the read on line 3 but before the CAS on line 5 attempts the update, the compare and swap instruction will fail because the state from which you're updating is not the one you used to calculate the desired target state. Such do/while loops can be said to "spin" rather than lock, as they go round and round the loop until CAS succeeds.

Crucially, your existing threading library can be expected to have a two-stage locking approach for mutexes, read-write locks, etc., involving:

First stage: spinning using CAS or similar (i.e. spin on { read the current value, if it's not set then cas(current = not set, new = set) }). This means that a quick update by another thread often won't cause your thread to be swapped out to wait, with all the relatively time-consuming overheads that involves.
Second stage: used only if some limit of loop iterations or elapsed time is exceeded. It asks the operating system to queue the thread until it knows (or at least suspects) the lock is free to acquire.
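The two stages can be sketched roughly as follows. This is an illustration only: SpinThenBlockLock, its spin limit of 100 and guarded_count are made-up names and values, and std::mutex::try_lock stands in for the raw CAS spin. A real std::mutex usually does something similar internally, so wrapping it like this is not expected to gain anything.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical wrapper showing the two stages explicitly.
class SpinThenBlockLock
{
public:
    void lock()
    {
        constexpr int spin_limit = 100;       // arbitrary, for illustration
        for (int i = 0; i < spin_limit; ++i)
        {
            if (inner_.try_lock())            // stage 1: cheap, non-blocking attempt
                return;
        }
        inner_.lock();                        // stage 2: may block until the lock is free
    }

    void unlock() { inner_.unlock(); }

private:
    std::mutex inner_;
};

// Usage sketch: several threads increment a plain counter under the lock.
long guarded_count(int thread_count, int increments_per_thread)
{
    SpinThenBlockLock lock;
    long total = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < thread_count; ++t)
    {
        workers.emplace_back([&] {
            for (int i = 0; i < increments_per_thread; ++i)
            {
                std::lock_guard<SpinThenBlockLock> guard(lock);
                ++total;
            }
        });
    }
    for (auto& worker : workers)
        worker.join();
    return total;
}
```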

The implication of this is that if you're using a mutex to protect access to a variable, then you are unlikely to do any better by implementing your own CAS-based "mutex" to protect the same variable.

Lock-free algorithms come into their own when you are working directly on a variable that's small enough to update directly with the CAS instruction itself. Instead of being...

get a mutex (by spinning on CAS, falling back on slower OS queue)
update variable
release mutex

...they're simplified (and made faster) by simply having the spin on CAS do the variable update directly. Of course, you may find the work to calculate new value from old painful to repeat speculatively, but unless there's a LOT of contention you're not wasting that effort often.
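To make the contrast concrete, here is a hedged sketch of the same small update written both ways: claiming one of a limited number of slots. LockedSlots, take_slot_locked and take_slot_cas are hypothetical names chosen for this example.

```cpp
#include <atomic>
#include <mutex>

// Mutex version: get the mutex, update the variable, release the mutex.
struct LockedSlots
{
    std::mutex mutex;
    int used = 0;
};

bool take_slot_locked(LockedSlots& slots, int capacity)
{
    std::lock_guard<std::mutex> guard(slots.mutex);
    if (slots.used >= capacity)
        return false;
    ++slots.used;
    return true;
}

// Lock-free version: the CAS spin performs the update itself.
bool take_slot_cas(std::atomic<int>& used, int capacity)
{
    int current = used.load();
    while (current < capacity)
    {
        // Succeeds only if no other thread changed `used` since we read it;
        // on failure `current` is refreshed and the capacity check is repeated.
        if (used.compare_exchange_weak(current, current + 1))
            return true;
    }
    return false;   // all slots already taken
}
```

The work repeated on a failed CAS here is just one comparison and one addition, which is the cheap case described above.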

This restriction to updating only a single location in memory has far-reaching implications, and work-arounds can require some creativity. For example, if you had a container using lock-free algorithms, you may decide to calculate a potential change to an element in the container, but can't sync that with updating a size variable elsewhere in memory. You may need to live without size, or be able to use an approximate size where you do a CAS-spin to increment or decrement the size later, but any given read of size may be slightly wrong. You may need to merge two logically-related data structures, such as a free list and the element-container, to share an index, then bit-pack the core fields for each into the same atomically-sized word at the start of each record.

These kinds of data optimisations can be very invasive, and sometimes won't get you the behavioural characteristics you'd like. Mutexes et al are much easier in this regard, and at least you know you won't need a rewrite to mutexes if requirements evolve just that step too far. That said, clever use of a lock-free approach really can be adequate for a lot of needs, and yield a very gratifying performance and scalability improvement.
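As a rough sketch of the bit-packing idea, assume two 32-bit fields (a head index and a count) that must change together. Packing them into one 64-bit word lets a single CAS update both. The names FreeListState, pack, unpack and push_free are hypothetical, and the sketch assumes std::atomic<std::uint64_t> is lock-free on the target platform.

```cpp
#include <atomic>
#include <cstdint>

// Two logically related fields that must be updated together.
struct FreeListState
{
    std::uint32_t head_index;
    std::uint32_t free_count;
};

// Pack both fields into one word: head_index in the high 32 bits.
std::uint64_t pack(FreeListState state)
{
    return (static_cast<std::uint64_t>(state.head_index) << 32) | state.free_count;
}

FreeListState unpack(std::uint64_t word)
{
    return FreeListState{static_cast<std::uint32_t>(word >> 32),
                         static_cast<std::uint32_t>(word & 0xFFFFFFFFu)};
}

// Sets a new head index and bumps the count in one atomic step.
void push_free(std::atomic<std::uint64_t>& word, std::uint32_t new_head)
{
    std::uint64_t old_word = word.load();
    std::uint64_t new_word;
    do
    {
        FreeListState state = unpack(old_word);
        state.head_index = new_head;
        state.free_count += 1;
        new_word = pack(state);
        // On failure old_word is refreshed, so both fields are recomputed.
    } while (!word.compare_exchange_weak(old_word, new_word));
}
```

The cost is visible even in this small example: every reader has to unpack the word, and the fields are limited to what fits in it.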

A core (good) consequence of lock-free algorithms is that a thread can't be holding a mutex when it happens to get swapped out by the scheduler, leaving other threads unable to work until it resumes; with CAS, the other threads can instead spin safely and efficiently without an OS fallback option.

Things that lock-free algorithms can be good for include updating usage/reference counters, modifying pointers to cleanly switch the pointed-to data, free lists, linked lists, marking hash-table buckets used/unused, and load-balancing. There are many others, of course.

As you say, simply hiding use of mutexes behind some API is not lock-free.
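As one illustration of the linked-list case, here is a minimal sketch of a stack whose head pointer is updated with CAS. Node and PushOnlyStack are hypothetical names. Removing single nodes is deliberately left out, since doing that safely needs extra care (the ABA problem and memory reclamation); this version only supports pushing and taking the whole list at once.

```cpp
#include <atomic>

struct Node
{
    int value;
    Node* next;
};

class PushOnlyStack
{
public:
    // Links `node` in front of the current head using a CAS loop.
    void push(Node* node)
    {
        node->next = head_.load();
        // If head_ changed since we read it, compare_exchange_weak stores
        // the new head into node->next and we simply try again.
        while (!head_.compare_exchange_weak(node->next, node))
        {
        }
    }

    // Detaches the whole list in one atomic exchange; the caller then
    // owns every node reachable from the returned pointer.
    Node* take_all()
    {
        return head_.exchange(nullptr);
    }

private:
    std::atomic<Node*> head_{nullptr};
};
```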

answered Nov 15 '22 18:11

Tony Delroy

There are a lot of different approaches to synchronization, such as the various forms of message-passing (for example, CSP) or transactional memory.

Both of these may be implemented using locks, but that's an implementation detail.

And then of course, for some purposes, there are lock-free algorithms or data-structures, which make do with just a few atomic instructions (such as compare-and-swap), but these aren't really a general-purpose replacement for locks.
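To illustrate the point that message-passing may use locks purely as an implementation detail, here is a minimal sketch of a channel. Channel, send and receive are hypothetical names; the mutex and condition variable stay private, so callers only ever exchange messages.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

template <typename T>
class Channel
{
public:
    // Hands a message over to whichever thread calls receive().
    void send(T message)
    {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            queue_.push(std::move(message));
        }
        ready_.notify_one();
    }

    // Blocks until a message is available, then returns it.
    T receive()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty(); });
        T message = std::move(queue_.front());
        queue_.pop();
        return message;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> queue_;
};
```

Code using such a channel never touches a lock directly, although by the question's own definition this is still not lock-free.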

answered Nov 15 '22 18:11

jalf

Tags: c++, multithreading, locking
