If I have a single int which I want to write to from one thread and read from on another, I need to use std::atomic, to ensure that its value is consistent across cores, regardless of whether or not the instructions that read from and write to it are conceptually atomic. If I don't, it may be that the reading core has an old value in its cache, and will not see the new value. This makes sense to me.
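For example (a minimal sketch of what I mean; the names are illustrative):

```cpp
#include <atomic>
#include <iostream>
#include <thread>

std::atomic<int> value{0};  // a plain int here would be a data race

int main() {
    std::thread writer([] { value.store(42); });  // one thread writes
    std::thread reader([] {
        while (value.load() == 0) { /* spin until the write becomes visible */ }
        std::cout << value.load() << '\n';        // guaranteed to print 42
    });
    writer.join();
    reader.join();
}
```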
If I have some complex data type that cannot be read/written to atomically, I need to guard access to it using some synchronisation primitive, such as std::mutex. This will prevent the object from getting into (or being read while in) an inconsistent state. This makes sense to me.
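For example (again a sketch; the Account type is made up for illustration):

```cpp
#include <mutex>
#include <string>
#include <vector>

// A type too large to be read or written in one atomic operation.
struct Account {
    std::string owner;
    std::vector<int> transactions;
};

std::mutex accountMutex;
Account account;

void addTransaction(int amount) {
    std::lock_guard<std::mutex> guard(accountMutex);  // held for the whole update
    account.transactions.push_back(amount);  // no other thread can observe a half-done update
}
```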
What doesn't make sense to me is how mutexes help with the caching problem that atomics solve. They seem to exist solely to prevent concurrent access to some resource, but not to propagate any values contained within that resource to other cores' caches. Is there some part of their semantics I've missed which deals with this?
First, we need to declare a lock variable of type Mutex. Second, before entering the critical section that is protected by that lock, execute the Lock() method of that lock. Third, before exiting the critical section, execute the Unlock() method. In the sketch below, we declare a lock MyLock and use it around a critical section.
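A minimal C++ rendering of those three steps, assuming std::mutex (which spells the methods lock() and unlock()):

```cpp
#include <mutex>

std::mutex MyLock;      // step 1: declare the lock
int sharedCounter = 0;  // the resource MyLock protects

void increment() {
    MyLock.lock();      // step 2: lock before entering the critical section
    ++sharedCounter;    // critical section: one thread at a time executes this
    MyLock.unlock();    // step 3: unlock before exiting the critical section
}
```

In practice std::lock_guard is preferred over manual lock()/unlock() calls, since it releases the mutex automatically even if the critical section throws.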
A mutex, or mutual exclusion object, gives access to a resource to only one thread at a time: all threads may use the same resource, but only one is allowed to hold it at any given moment. A mutex handles the critical-section problem with a lock-based technique.
Mutex is an abbreviation for mutual exclusion, as in, a mutex allows only one thread to access some data at any given time.
Summing up: in general, atomic operations are faster than a mutex when contention between threads is sufficiently low.
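To illustrate the trade-off, here are the two ways of maintaining a shared counter (a sketch; actual performance depends on the platform and the contention level):

```cpp
#include <atomic>
#include <mutex>

std::atomic<int> atomicCounter{0};

int mutexCounter = 0;
std::mutex counterMutex;

// Typically a single hardware instruction (e.g. LOCK XADD on x86):
// cheap when threads rarely touch the counter at the same time.
void incrementAtomic() {
    atomicCounter.fetch_add(1, std::memory_order_relaxed);
}

// A lock/unlock pair: more overhead per operation, but it scales to
// arbitrarily complex critical sections and can behave better under
// heavy contention, since waiters sleep instead of hammering the cache line.
void incrementWithMutex() {
    std::lock_guard<std::mutex> guard(counterMutex);
    ++mutexCounter;
}
```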
The right answer to this is magic pixies - i.e. It Just Works. The implementation of std::atomic for each platform must do the right thing.
The right thing is a combination of 3 parts.
Firstly, the compiler needs to know that it can't move instructions across the boundaries of the atomic operation [in fact it can in some cases, but assume that it doesn't].
Secondly, the cache/memory subsystem needs to know - this is generally done using memory barriers, although x86/x64 generally have such strong memory guarantees that this isn't necessary in the vast majority of cases (which is a big shame, as it's nice for wrong code to actually go wrong).
Finally, the CPU needs to know it cannot reorder instructions. Modern CPUs are massively aggressive at reordering operations, while making sure that in the single-threaded case this is unnoticeable. They may need extra hints that this reordering must not happen in certain places.
For most CPUs, parts 2 and 3 come down to the same thing - a memory barrier implies both. Part 1 is entirely inside the compiler, and is down to the compiler writers to get right.
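To make the three parts concrete, here is a sketch of the classic message-passing pattern; the release store and acquire load are the points where both the compiler and the CPU are forbidden from reordering (variable names are illustrative):

```cpp
#include <atomic>

int payload = 0;                 // ordinary, non-atomic data
std::atomic<bool> ready{false};

void producer() {
    payload = 42;                                  // (A) ordinary write
    ready.store(true, std::memory_order_release);  // (A) may not move below this store
}

void consumer() {
    while (!ready.load(std::memory_order_acquire)) {}  // (B) may not move above this load
    int r = payload;                               // (B) guaranteed to read 42
    (void)r;
}
```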
See Herb Sutter's talk 'Atomic Weapons' for a lot more interesting info.
The consistency across cores is ensured by memory barriers (which also prevent instruction reordering). When you use std::atomic, not only do you access the data atomically, but the compiler (and library) also insert the relevant memory barriers.
Mutexes work the same way: the mutex implementations (e.g. pthreads or WinAPI or whatnot) internally also insert memory barriers.
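In other words, unlocking a mutex acts as a release and locking it acts as an acquire, so a mutex gives the same visibility guarantee as an atomic. A sketch (illustrative names):

```cpp
#include <mutex>

std::mutex m;
int payload = 0;  // ordinary data, protected by m

void producer() {
    std::lock_guard<std::mutex> guard(m);
    payload = 42;     // the unlock at the end of this scope publishes the write
}

void consumer() {
    std::lock_guard<std::mutex> guard(m);
    int r = payload;  // this lock synchronizes with the earlier unlock, so the
                      // thread sees an up-to-date value (42 or 0, depending on
                      // which thread acquired the mutex first)
    (void)r;
}
```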