The idea behind mutexes is to allow only one thread access to a section of memory at any one time. If one thread locks the mutex, any other lock attempt will block until the first one unlocks it. But how is this implemented? To lock itself, the mutex has to set a bit somewhere that says it is locked. But what if a second thread is reading that bit at the same time the first is writing it? Worse, what if both threads try to lock the mutex at the same time? The mutex would succumb to the very problem it is meant to prevent.
How do mutexes really work?
One advantage of some OS mutexes is that they guarantee fairness: all threads waiting for a lock form a queue, and when the lock is released, the thread at the head of the queue acquires it. The handoff order is then fully deterministic.
A mutex only blocks threads that cooperate by acquiring it; it does not lock the resource itself. You can still access the data via direct memory manipulation, so in that sense a mutex is not safe (but then, nothing can stop direct memory manipulation anyway).
Mutexes can synchronize threads within the same process or across processes. For the cross-process case, the mutex must be allocated in writable memory shared among the cooperating processes (see mmap(2)) and initialized for that purpose.
When a thread blocks trying to acquire a mutex, execution resumes when the mutex is released, though the thread might block again if another thread grabs the mutex first. There is generally also a try-lock operation that grabs the mutex if possible and otherwise returns an error immediately.
Low-level atomic operations are essentially mutexes implemented in hardware, except that only a handful of very simple operations can be performed atomically.
Consider the following pseudocode, equivalent to what the hardware provides:
```cpp
mutex global_mutex;

void InterlockedAdd(int& dest, int value) {
    scoped_lock lock(global_mutex);
    dest += value;
}

int InterlockedRead(int& src) {
    scoped_lock lock(global_mutex);
    return src;
}

void InterlockedWrite(int& dest, int value) {
    scoped_lock lock(global_mutex);
    dest = value;
}
```
These functions are implemented as single instructions by the CPU, and they guarantee consistency between threads to varying degrees; the exact semantics depend on the CPU in question. x86, for example, offers sequential consistency for these operations: they act as if they were issued sequentially, in some global order. This naturally involves a small amount of blocking.
You may accurately surmise that atomic operations can be implemented in terms of mutexes, and vice versa. In practice, though, atomic operations are provided by the hardware, and mutexes and other synchronization primitives are then built on top of them by the operating system. This is because some algorithms do not require a full mutex and can operate "locklessly", using only atomic operations for inter-thread consistency.
A simple implementation that has been used in the past relies on a CPU-level atomic exchange ("lock and exchange") instruction: a special instruction that atomically swaps a given value with the value in some memory location.
A thread could acquire such a mutex by trying to swap a 1 into the memory location. If the old value comes back as 0, the thread assumes it now holds the mutex and continues. If the value returned is 1, some other thread currently holds the mutex, and the acquiring thread waits before trying again.
The above is a highly simplified outline of what might happen in a simple system. Real operating systems are much more complex these days.