I'm studying the difference between `mutex` and `atomic` in C++11. As I understand it, a `mutex` is a kind of lock mechanism implemented on top of the OS/kernel. For example, Linux offers such a mechanism, namely `futex`; with the help of `futex` we can implement `mutex` and `semaphore`. Furthermore, I know that `futex` is itself implemented with low-level atomic operations such as `CompareAndSet` / `CompareAndSwap`.
For `std::atomic`, I know that it is defined in terms of the memory model introduced by C++11. However, I don't know how the memory model is implemented at the low level. If it is also implemented with atomic operations like `CompareAndSet`, what is the difference between `std::atomic` and `mutex`?
In a word: if `std::atomic::is_lock_free` gives me `false`, then I'd say `std::atomic` is essentially the same as a `mutex`. But if it gives me `true`, how is it implemented at the low level?
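For reference, here is roughly how I check it (the `Big` struct is just something I made up to force a case that is unlikely to be lock-free; the result of `is_lock_free` is implementation-defined):

```cpp
#include <atomic>
#include <cstdio>

struct Big { char data[64]; };  // likely too large for a lock-free atomic

int main() {
    std::atomic<int> small{0};
    std::atomic<Big> big{Big{}};  // may need -latomic on GCC/Clang

    // Per-object runtime query (C++11); answers are implementation-defined.
    std::printf("atomic<int> lock-free: %d\n", small.is_lock_free());
    std::printf("atomic<Big> lock-free: %d\n", big.is_lock_free());
}
```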
If atomic operations are `lock_free`, they're probably implemented the same way the components of a mutex are implemented. After all, to lock a mutex, you really want some kind of atomic operation to ensure that one and only one thread locks the mutex.
The difference is, atomic operations that are lock free don't have a "locked" state. Let's compare two possible ways to do an atomic increment of a variable:
First, the mutex way. We lock a mutex, we read-increment-write the variable, then we unlock the mutex. If the thread gets interrupted during the read-increment-write, other threads that attempt to perform the same operation will block trying to lock the mutex. (See "Where is the lock for a std::atomic?" for how this works on some real implementations, for objects too large to be `lock_free`.)
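A rough sketch of the mutex way, assuming a plain `int` counter protected by a `std::mutex`:

```cpp
#include <mutex>

// The mutex way: protect a plain int with a std::mutex.
// If a thread is preempted between the read and the write, other threads
// block on the lock until it finishes.
int counter = 0;
std::mutex m;

void increment_with_mutex() {
    std::lock_guard<std::mutex> lock(m);  // lock
    ++counter;                            // read-increment-write
}                                         // unlock on scope exit
```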
Second, the atomic way. The CPU "locks" just the cache line holding the variable we want to modify for the duration of a single read-increment-write instruction. (This means the CPU delays responding to MESI requests to invalidate or share the cache line, keeping exclusive access so no other CPU can look at it. MESI cache coherency always requires exclusive ownership of a cache line before a core can modify it, so this is cheap if we already own the line.) It is not possible for us to get interrupted during an instruction. Another thread that attempts to access this variable, at worst, has to wait for the cache-coherency hardware to sort out who can modify the memory location.
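A rough sketch of the atomic way (the `memory_order_relaxed` here is just a choice for a plain counter; the exact instruction it compiles down to is platform-specific, e.g. a single `lock add` on x86):

```cpp
#include <atomic>

// The atomic way: no mutex, no "locked" state, just one indivisible
// read-modify-write on the counter itself.
std::atomic<int> counter{0};

void increment_atomically() {
    counter.fetch_add(1, std::memory_order_relaxed);  // one atomic RMW
}
```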
So how do we lock a mutex? Likely we perform an atomic compare-and-swap. So light atomic operations are the primitives from which heavy mutex operations are assembled.
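To make that concrete, here is a toy spinlock built from nothing but an atomic compare-and-swap. It is only an illustration of the idea: a real `std::mutex` will typically put the waiting thread to sleep in the kernel (e.g. via `futex`) instead of spinning forever.

```cpp
#include <atomic>

// A toy spinlock: the "heavy" lock is assembled from the "light"
// atomic compare-and-swap primitive.
class SpinLock {
    std::atomic<bool> locked{false};
public:
    void lock() {
        bool expected = false;
        // Try to flip false -> true; retry until we win the race.
        while (!locked.compare_exchange_weak(expected, true,
                                             std::memory_order_acquire)) {
            expected = false;  // compare_exchange updated it on failure
        }
    }
    void unlock() {
        locked.store(false, std::memory_order_release);
    }
};
```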
Of course this is all platform-specific. But this is what typical, modern platforms that you are likely to use do.