My question is about lock-free synchronization in multithreaded code. I would like to know the following:
What are the general approaches to achieve this? I read somewhere about lock-free primitives such as Compare-And-Swap (CAS) or Double Compare-And-Swap (DCAS), but no explanation of them was given. Are there any approaches to minimize the use of locks?
How do Java/.NET implement their concurrent containers? Do they use locks or lock-free synchronization?
Thanks in advance.
It's a software mechanism implemented in user mode, i.e. it requires no support from the operating system. It's a busy-waiting solution (it keeps the CPU busy even while it is technically waiting), and it can be used for more than two processes.
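To make the busy-waiting idea concrete, here is a minimal sketch of a user-mode spin lock built on a single CAS flag (the class name `SpinLock` is illustrative, not from any library):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// A minimal user-mode spin lock: acquire() busy-waits (keeps the CPU
// spinning) until its CAS succeeds, with no help from the OS scheduler.
class SpinLock {
    private final AtomicBoolean held = new AtomicBoolean(false);

    void acquire() {
        // Spin until we atomically flip the flag from false to true.
        while (!held.compareAndSet(false, true)) {
            Thread.onSpinWait(); // hint to the CPU that we are busy-waiting
        }
    }

    void release() {
        held.set(false);
    }
}
```

Note that any number of threads can contend for this lock, but every waiter burns CPU cycles while spinning, which is exactly the trade-off described above.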
A lock-free data structure increases the amount of time spent in parallel execution rather than serial execution, improving performance on a multi-core processor, because access to the shared data structure does not need to be serialized to stay coherent.
A lock-free queue is a concurrent queue that operates without locking. With a lock-free queue, slow or stopped processes do not prevent other processes from accessing the data in it.
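Java ships such a structure out of the box: `java.util.concurrent.ConcurrentLinkedQueue` is a lock-free queue based on the Michael-Scott CAS algorithm, so `offer()` and `poll()` never block and a stalled thread cannot hold up the others:

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// ConcurrentLinkedQueue: a lock-free FIFO queue. Both enqueue and
// dequeue are implemented with CAS retry loops, not locks.
ConcurrentLinkedQueue<String> queue = new ConcurrentLinkedQueue<>();
queue.offer("a");
queue.offer("b");
String head = queue.poll(); // returns "a": FIFO order, no lock taken
```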
Intuitively, lock-free means that some process is always guaranteed to make progress by completing its operations within a finite number of system steps, while wait-free means that each process completes its operations within a finite number of its own steps.
Here are some general approaches that can minimize the use of locks, assuming your algorithm has some particular exploitable features:
When updating a single numeric variable, you can use non-blocking primitives such as CAS, atomic increment, etc. They are usually much faster than a classic blocking critical section (lock, mutex).
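In Java these primitives are exposed through the `java.util.concurrent.atomic` classes. A quick sketch of both the one-shot atomic increment and the general CAS retry pattern:

```java
import java.util.concurrent.atomic.AtomicLong;

AtomicLong counter = new AtomicLong(0);

// Non-blocking primitive: a single hardware atomic instead of lock/unlock.
counter.incrementAndGet();           // counter is now 1

// The general CAS pattern: read, compute, retry if another thread raced us.
long prev, next;
do {
    prev = counter.get();
    next = prev * 2 + 1;             // any pure function of the old value
} while (!counter.compareAndSet(prev, next));
// counter is now 3
```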
When a data structure is read by multiple threads but written by only one or a few threads, an obvious solution is a read-write lock instead of a full lock.
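In Java the standard tool for this is `ReentrantReadWriteLock`: any number of readers can hold the read lock at once, and only writers take the exclusive lock. A sketch for a read-mostly map (the `Config` class is a made-up example, not a library type):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Read-mostly data: many readers may hold the read lock concurrently,
// while a writer takes the exclusive write lock only for updates.
class Config {
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    private final Map<String, String> settings = new HashMap<>();

    String get(String key) {
        rw.readLock().lock();         // shared: does not block other readers
        try {
            return settings.get(key);
        } finally {
            rw.readLock().unlock();
        }
    }

    void put(String key, String value) {
        rw.writeLock().lock();        // exclusive: blocks readers and writers
        try {
            settings.put(key, value);
        } finally {
            rw.writeLock().unlock();
        }
    }
}
```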
Try to exploit fine-grained locking. For example, instead of locking an entire data structure with a single lock, see if you can use multiple distinct locks to protect different sections of the data structure.
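One common form of this is lock striping, the technique the older Java `ConcurrentHashMap` implementations used: hash each key to one of N locks so that updates to different stripes proceed in parallel. A sketch (the `StripedCounters` class is invented for illustration):

```java
import java.util.concurrent.locks.ReentrantLock;

// Lock striping: one lock per stripe instead of one lock for the whole
// structure, so threads touching different stripes never contend.
class StripedCounters {
    private final int stripes;
    private final ReentrantLock[] locks;
    private final long[] counts;

    StripedCounters(int stripes) {
        this.stripes = stripes;
        this.locks = new ReentrantLock[stripes];
        this.counts = new long[stripes];
        for (int i = 0; i < stripes; i++) locks[i] = new ReentrantLock();
    }

    void increment(Object key) {
        int s = Math.floorMod(key.hashCode(), stripes);
        locks[s].lock();              // only this stripe is serialized
        try {
            counts[s]++;
        } finally {
            locks[s].unlock();
        }
    }

    long total() {
        long sum = 0;
        for (int i = 0; i < stripes; i++) {
            locks[i].lock();
            try { sum += counts[i]; } finally { locks[i].unlock(); }
        }
        return sum;
    }
}
```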
If you're relying on the implicit memory fence effect of locks to ensure visibility of a single variable across threads, just use volatile¹, if available.
Sometimes, using a condition variable (and its associated lock) is too slow in practice. In this case, a volatile busy spin is much more efficient.
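A minimal sketch of such a volatile busy spin in Java (the `FlagSpin` class is illustrative): the writer publishes a value and then sets a volatile flag; the reader spins on the flag, and Java's memory model guarantees that once it sees the flag, it also sees the payload written before it.

```java
// A volatile flag spin: no lock, no context switch. The volatile store
// in publish() happens-before the volatile load that exits the spin loop,
// so the plain payload field is guaranteed visible to the reader.
class FlagSpin {
    private int payload;                 // plain field, published via the flag
    private volatile boolean ready = false;

    void publish(int value) {
        payload = value;                 // ordinary write...
        ready = true;                    // ...made visible by the volatile store
    }

    int awaitValue() {
        while (!ready) {                 // busy spin on the volatile flag
            Thread.onSpinWait();
        }
        return payload;                  // safe: happens-before edge established
    }
}
```

This only pays off when the wait is expected to be very short; for long waits, the spinning thread wastes a core that a condition variable would have yielded.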
More good advice on this topic here: http://software.intel.com/en-us/articles/intel-guide-for-developing-multithreaded-applications/
A nice read in another SO question: Lock-free multi-threading is for real threading experts (don't be scared by the title).
And a recently discussed lock-free Java implementation of atomic_decrement: Starvation in non-blocking approaches
¹ The use of volatile here applies to languages such as Java, where volatile has defined semantics in the memory model, but not to C or C++, where volatile preceded the introduction of the cross-thread memory model and doesn't integrate with it. Similar constructs are available in those languages, such as the various std::memory_order specifiers in C++.