
Is std::mutex sequentially consistent?

Tags:

c++

multithreading

c++11

mutex

memory-barriers

Say I have two threads A and B writing to global Boolean variables fA and fB respectively, which are initially set to false and are protected by std::mutex objects mA and mB respectively:

// Thread A
mA.lock();
assert( fA == false );
fA = true;
mA.unlock();

// Thread B
mB.lock();
assert( fB == false );
fB = true;
mB.unlock();

Is it possible to observe the modifications on fA and fB in different orders in different threads C and D? In other words, can the following program

#include <atomic>
#include <cassert>
#include <iostream>
#include <mutex>
#include <thread>
using namespace std;

mutex mA, mB, coutMutex;
bool fA = false, fB = false;

int main()
{
    thread A{ []{
            lock_guard<mutex> lock{mA};
            fA = true;
        } };
    thread B{ [] {
            lock_guard<mutex> lock{mB};
            fB = true;
        } };
    thread C{ [] { // reads fA, then fB
            mA.lock();
            const auto _1 = fA;
            mA.unlock();
            mB.lock();
            const auto _2 = fB;
            mB.unlock();
            lock_guard<mutex> lock{coutMutex};
            cout << "Thread C: fA = " << _1 << ", fB = " << _2 << endl;
        } };
    thread D{ [] { // reads fB, then fA (i. e. vice versa)
            mB.lock();
            const auto _3 = fB;
            mB.unlock();
            mA.lock();
            const auto _4 = fA;
            mA.unlock();
            lock_guard<mutex> lock{coutMutex};
            cout << "Thread D: fA = " << _4 << ", fB = " << _3 << endl;
        } };
    A.join(); B.join(); C.join(); D.join();
}

legally print

Thread C: fA = 1, fB = 0
Thread D: fA = 0, fB = 1

according to the C++ standard?

Note: A spin-lock can be implemented using std::atomic<bool> variables with either sequentially consistent memory order or acquire/release memory order. So the question is whether a std::mutex behaves like a sequentially consistent spin-lock or like an acquire/release spin-lock.


asked Jan 25 '17 09:01

Ralph Tandetzky

1 Answer

~~Yes, that is allowed~~ That output isn't possible, but std::mutex is not necessarily sequentially consistent. Acquire/release is enough to rule out that behaviour.

std::mutex is not defined in the standard to be sequentially consistent, only that

30.4.1.2 Mutex types [thread.mutex.requirements.mutex]

11 Synchronization: Prior unlock() operations on the same object shall synchronize with (1.10) this operation [lock()].

Synchronize-with seems to be defined in the same way as for std::memory_order::release/acquire (see this question).
As far as I can see, an acquire/release spinlock would satisfy the standard's requirements for std::mutex.
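To make "acquire/release spinlock" concrete, here is a minimal sketch of what such a lock could look like. The names SpinLock and count_with_spinlock are hypothetical and only for illustration; this is not how any particular std::mutex is implemented. The release store in unlock() pairs with the acquire exchange in lock(), which gives the same synchronizes-with guarantee the quoted paragraph requires.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical spinlock for illustration only; not a standard library type.
class SpinLock {
    std::atomic<bool> locked_{false};
public:
    void lock() {
        // Acquire: a successful exchange reads the value written by the
        // previous unlock(), so that unlock() synchronizes with this lock().
        while (locked_.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }
    void unlock() {
        // Release: everything done while holding the lock becomes visible
        // to the next thread that acquires it.
        locked_.store(false, std::memory_order_release);
    }
};

// Increments a plain int from several threads, protected by the spinlock.
int count_with_spinlock(int threads, int iterations) {
    SpinLock lock;
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<SpinLock> guard{lock};
                ++counter;
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Calling count_with_spinlock(4, 10000) should return 40000, since no increment can be lost while the lock is held.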

Big edit:

However, I don't think that means what you think (or what I thought). The output is still not possible, since acquire/release semantics are enough to rule it out. This is a kind of subtle point that is better explained here. It seems obviously impossible at first but I think it's right to be cautious with stuff like this.

From the standard, unlock() synchronises with lock(). That means anything that happens before the unlock() is visible after the lock(). Happens-before (henceforth ->) is a slightly weird relation, explained better in the above link, but because there are mutexes around everything in this example, everything works as you would expect: const auto _1 = fA; happens before const auto _2 = fB;, and any changes visible to a thread when it unlock()s the mutex are visible to the next thread that lock()s the mutex. It also has the expected properties: it is transitive (if X -> Y and Y -> Z, then X -> Z) and asymmetric (if X -> Y, then Y does not happen before X).

From here it's not hard to derive the contradiction that confirms the intuition.

In short, there's a well-defined order of operations for each mutex, e.g. for mutex A, threads A, C and D hold the lock in some sequence. For thread D to print fA=0, it must lock mA before thread A does; for thread C to print fA=1, it must lock mA after thread A. So the lock sequence for mA is D(mA) -> A(mA) -> C(mA).

By the same reasoning (C prints fB=0, D prints fB=1), the sequence for mutex B must be C(mB) -> B(mB) -> D(mB).

But from the program we know C(mA) -> C(mB), so that lets us put both together to get D(mA) -> A(mA) -> C(mA) -> C(mB) -> B(mB) -> D(mB), which means D(mA) -> D(mB). But the code also gives us D(mB) -> D(mA), which is a contradiction, meaning your observed output is not possible.
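The argument above can also be checked by brute force. The following is only a sketch under a simplifying assumption: each critical section is treated as one indivisible step, and every execution is assumed to correspond to some total order of the six critical sections that respects program order in C and D. All names (Section, outcome_reachable) are made up for this illustration.

```cpp
#include <algorithm>
#include <array>

// The six critical sections of the example program (illustrative names).
enum Section { A_mA, B_mB, C_mA, C_mB, D_mB, D_mA };

// Returns true if some ordering of the critical sections, respecting program
// order inside threads C and D, makes C and D read exactly the given values.
bool outcome_reachable(bool c_fA, bool c_fB, bool d_fA, bool d_fB) {
    std::array<int, 6> order{A_mA, B_mB, C_mA, C_mB, D_mB, D_mA};
    std::sort(order.begin(), order.end());
    do {
        std::array<int, 6> pos{};
        for (int i = 0; i < 6; ++i) pos[order[i]] = i;
        // Program order: C reads fA before fB, D reads fB before fA.
        if (pos[C_mA] > pos[C_mB] || pos[D_mB] > pos[D_mA]) continue;

        bool fA = false, fB = false;
        bool rc_fA = false, rc_fB = false, rd_fA = false, rd_fB = false;
        for (int s : order) {
            switch (s) {
                case A_mA: fA = true;  break;  // thread A writes fA
                case B_mB: fB = true;  break;  // thread B writes fB
                case C_mA: rc_fA = fA; break;  // thread C reads fA
                case C_mB: rc_fB = fB; break;  // thread C reads fB
                case D_mB: rd_fB = fB; break;  // thread D reads fB
                case D_mA: rd_fA = fA; break;  // thread D reads fA
            }
        }
        if (rc_fA == c_fA && rc_fB == c_fB && rd_fA == d_fA && rd_fB == d_fB)
            return true;
    } while (std::next_permutation(order.begin(), order.end()));
    return false;
}
```

Under that assumption, outcome_reachable(true, false, false, true) is false, which is the output from the question, while outcomes such as "everyone sees 1" or "everyone sees 0" are reachable.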

This outcome is no different for an acquire/release spinlock. I think everyone was confusing regular acquire/release memory access on a variable with access to a variable protected by a spinlock. The difference is that with a spinlock, the reading threads also perform a compare/exchange and a release write, which is a completely different scenario from a single release write and acquire read.
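For contrast, this is roughly what the "single release write and acquire read" scenario looks like without any lock. It is a sketch with made-up names (Observed, run_once, is_split). As far as I understand the memory model, the split outcome is permitted here because the readers never write anything that would order them relative to each other; whether you ever actually observe it depends on the hardware, and on x86 you most likely won't.

```cpp
#include <atomic>
#include <thread>

// Values seen by the two reader threads (illustrative struct).
struct Observed { bool c_x, c_y, d_x, d_y; };

// True if the readers disagree about the order of the two writes.
bool is_split(const Observed& r) {
    return r.c_x && !r.c_y && r.d_y && !r.d_x;
}

// Two writers and two readers using plain acquire/release atomics, no locks.
Observed run_once() {
    std::atomic<bool> x{false}, y{false};
    Observed r{};
    std::thread a{[&] { x.store(true, std::memory_order_release); }};
    std::thread b{[&] { y.store(true, std::memory_order_release); }};
    std::thread c{[&] {  // reads x, then y
        r.c_x = x.load(std::memory_order_acquire);
        r.c_y = y.load(std::memory_order_acquire);
    }};
    std::thread d{[&] {  // reads y, then x
        r.d_y = y.load(std::memory_order_acquire);
        r.d_x = x.load(std::memory_order_acquire);
    }};
    a.join(); b.join(); c.join(); d.join();
    return r;
}
```

Running run_once() in a loop and counting how often is_split() returns true is a cheap experiment, but a count of zero on one machine proves nothing about what the standard allows.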

If you used a sequentially consistent spinlock then this wouldn't affect the output. The only difference is that you could always categorically answer questions like "mutex A was locked before mutex B" from a separate thread that didn't acquire either lock. But for this example and most others, that kind of statement isn't useful, hence acquire/release being the standard.


answered Oct 20 '22 16:10

Joseph Ireland
