Maybe I'm misunderstanding about <code>std::mutex::try_lock</code>: This function is allowed to fail spuriously and return false even if the mutex is not currently locked by any other thread. This means that if no one thread has a lock on that <code>mutex</code>, when I try a <code>try_lock</code> it could return <code>false</code>? For what purpose? Isn't the function of <code>try_lock</code> return <code>false</code> if its locked OR <code>true</code> if nobody lock it? Not really sure if my non-native english is fooling me...

<blockquote> This means that if no one thread has a lock of that mutex, when I try a try_lock, it could return false? </blockquote> Yes, that's exactly what it says. <blockquote> Isn't the function of try_lock return false if its locked OR true if nobody lock it? </blockquote> No, the function of <code>try_lock</code> is to try to lock the mutex. However, there is more than one way it can fail: <ol> <li>the mutex is already locked elsewhere (this is the one you're thinking of)</li> <li>some platform-specific feature interrupts or prevents the locking attempt, and control is returned to the caller who can decide whether to retry.</li> </ol> The common case on POSIX-ish platforms, and inherited from POSIX threads, is that a signal is delivered to (and handled by a signal handler in) the current thread, interrupting the lock attempt. There may be other platform-specific reasons on other platforms, but the behaviour is the same.

Based on your comments, I would write (quoting your words): <pre class="prettyprint"><code>std::unique_lock<std::mutex> lock(m, std::defer_lock); // m being a mutex ... if (lock.try_lock()) { ... // "DO something if nobody has a lock" } else { ... // "GO AHEAD" } </code></pre> Note that <code>lock.try_lock()</code> effectively calls <code>m.try_lock()</code>, therefore it is prone to spurious fail as well. But I wouldn't care much about this issue. IMO, in practice, spurious fails/wakeups are quite rare (as Useless pointed out, on Linux, they can happen when a signal is delivered). More about spurious issues, see e.g.: https://en.wikipedia.org/wiki/Spurious_wakeup or Why does pthread_cond_wait have spurious wakeups?. UPDATE If you really want to eliminate spurious fail of <code>try_lock</code>, you can use some atomic flag such as: <pre class="prettyprint"><code>// shared by threads: std::mutex m; std::atomic<bool> flag{false}; // within threads: std::unique_lock<std::mutex> lock(m, std::defer_lock); // m being a mutex ... while (true) { lock.try_lock(); if (lock.owns_lock()) { flag = true; ... // "DO something if nobody has a lock" flag = false; break; } else if (flag == true) { ... // "GO AHEAD" break; } } </code></pre> It may be possibly rewritten to better form, I didn't check. Also, note that <code>flag</code> is not automatically unset via RAII, some scope guard may be useful here. UPDATE 2 If you do not need also the blocking functionality of <code>mutex</code>, use <code>std::atomic_flag</code>: <pre class="prettyprint"><code>std::atomic_flag lock = ATOMIC_FLAG_INIT; // within threads: if (lock.test_and_set()) { ... // "DO something if nobody has a lock" lock.clear(); } else { ... // "GO AHEAD" } </code></pre> Just, again, clearing the flag would be better via some RAII mechanism.

In paper Foundations of the C++ Concurrency Memory Model Section 3, there is already a clear explanation for why the standard allows spurious failures of <code>try_lock</code>. In short, it is specified to make the semantics of <code>try_lock</code> be consistent with the definition of race in C++ memory model.

std::mutex::try_lock spuriously fail?

4 Answers

This means that if no one thread has a lock of that mutex, when I try a try_lock, it could return false?

Yes, that's exactly what it says.

Isn't the function of try_lock return false if its locked OR true if nobody lock it?

No, the function of try_lock is to try to lock the mutex.

However, there is more than one way it can fail:

the mutex is already locked elsewhere (this is the one you're thinking of)
some platform-specific feature interrupts or prevents the locking attempt, and control is returned to the caller who can decide whether to retry.

The common case on POSIX-ish platforms, and inherited from POSIX threads, is that a signal is delivered to (and handled by a signal handler in) the current thread, interrupting the lock attempt.

There may be other platform-specific reasons on other platforms, but the behaviour is the same.

122

answered Oct 05 '22 09:10

Useless

Based on your comments, I would write (quoting your words):

std::unique_lock<std::mutex> lock(m, std::defer_lock); // m being a mutex
...
if (lock.try_lock()) {
  ... // "DO something if nobody has a lock"
} else {
  ... // "GO AHEAD"
}

Note that lock.try_lock() effectively calls m.try_lock(), therefore it is prone to spurious fail as well. But I wouldn't care much about this issue. IMO, in practice, spurious fails/wakeups are quite rare (as Useless pointed out, on Linux, they can happen when a signal is delivered).

More about spurious issues, see e.g.: https://en.wikipedia.org/wiki/Spurious_wakeup or Why does pthread_cond_wait have spurious wakeups?.

UPDATE

If you really want to eliminate spurious fail of try_lock, you can use some atomic flag such as:

// shared by threads:
std::mutex m;  
std::atomic<bool> flag{false};

// within threads:
std::unique_lock<std::mutex> lock(m, std::defer_lock); // m being a mutex
...
while (true) {
  lock.try_lock();
  if (lock.owns_lock()) {
    flag = true;
    ... // "DO something if nobody has a lock"    
    flag = false;
    break;
  } else if (flag == true) {
    ... // "GO AHEAD"
    break;
  }
}

It may be possibly rewritten to better form, I didn't check. Also, note that flag is not automatically unset via RAII, some scope guard may be useful here.

UPDATE 2

If you do not need also the blocking functionality of mutex, use std::atomic_flag:

std::atomic_flag lock = ATOMIC_FLAG_INIT;

// within threads:
if (lock.test_and_set()) {
    ... // "DO something if nobody has a lock"    
    lock.clear();
} else {
    ... // "GO AHEAD"
}

Just, again, clearing the flag would be better via some RAII mechanism.

answered Oct 05 '22 09:10

Daniel Langr

Unlike was said there, I don't think there is any reason for a try_lock function to failed due to OS-related reasons: such operation is non-blocking, so signals cannot really interrupt it. Most likely it has everything to do with how this function is implemented on CPU-level. After all, uncontested case is usually the most interesting one for a mutex.

Mutex locking usually requires some form of atomic compare exchange operation. C++11 and C11 introduce atomic_compare_exchange_strong and atomic_compare_exchange_weak. The latter is allowed to fail spuriously.

By allowing try_lock to fail spuriously, implementations are allowed to use atomic_compare_exchange_weak to maximize performance and minimize code size.

For example on ARM64 atomic operations are usually implemented using exclusive-load (LDXR) and exclusive-store (STRX) instructions. LDXR fires up "monitor" hardware which starts tracking all accesses to a memory region. STRX only performs the store if no accesses to that region were made between LDXR and STRX instructions. Thus the whole sequence can fail spuriously if another thread accesses that memory region or if there was an IRQ in between the two.

In practice, code generate for try_lock implemented using weak guarantee is not very different from the one implemented using strong guarantee.

bool mutex_trylock_weak(atomic_int *mtx)
{
    int old = 0;
    return atomic_compare_exchange_weak(mtx, &old, 1);
}

bool mutex_trylock_strong(atomic_int *mtx)
{
    int old = 0;
    return atomic_compare_exchange_strong(mtx, &old, 1);
}

Take a look at generated assembly for ARM64:

mutex_trylock_weak:
  sub sp, sp, #16
  mov w1, 0
  str wzr, [sp, 12]
  ldaxr w3, [x0]      ; exclusive load (acquire)
  cmp w3, w1
  bne .L3
  mov w2, 1
  stlxr w4, w2, [x0]  ; exclusive store (release)
  cmp w4, 0           ; the only difference is here
.L3:
  cset w0, eq
  add sp, sp, 16
  ret

mutex_trylock_strong:
  sub sp, sp, #16
  mov w1, 0
  mov w2, 1
  str wzr, [sp, 12]
.L8:
  ldaxr w3, [x0]      ; exclusive load (acquire)
  cmp w3, w1
  bne .L9
  stlxr w4, w2, [x0]  ; exclusive store (release)
  cbnz w4, .L8        ; the only difference is here
.L9:
  cset w0, eq
  add sp, sp, 16
  ret

The only difference is that "weak" version eliminates conditional backward branch cbnz w4, .L8 and replaces it with cmp w4, 0. Backward condition branches are predicted by CPU as "will-be-taken" in the absense of branch prediction information as they are assumed to be part of a loop - such assumption is wrong in this case as most of the time lock will be acquired (low contention is assumed to be the most common case).

Imo this is the only performance difference between those functions. "Strong" version can basically suffer from 100% branch misprediction ratio on single instruction under some workloads.

By the way, ARMv8.1 introduces atomic instructions, so there is no difference between the two, just like on x86_64. Code generated with -march=armv8.1-a flag:

  sub sp, sp, #16
  mov w1, 0
  mov w2, 1
  mov w3, w1
  str wzr, [sp, 12]
  casal w3, w2, [x0]
  cmp w3, w1
  cset w0, eq
  add sp, sp, 16
  ret

Some try_lock functions can fail even when atomic_compare_exchange_strong is used, for example try_lock_shared of shared_mutex might need to increment reader counter and might fail if another reader has entered the lock. "Strong" variant of such function would need to generate a loop and thus can suffer from a similar branch mispredication.

Another minor detail: if mutex is written in C, some compilers (like Clang) might align loop at 16-byte boundary to improve its performance, bloating function body with padding. This is unnecessary if loop almost always runs a single time.

Another reason for spurious failure is a failure to acquire internal mutex lock (if mutex is implemented using a spinlock and some kernel primitive). In theory the same principle could be acquired in the kernel implementation of try_lock, although this does not seems reasonable.

answered Oct 05 '22 09:10

StaceyGirl

In paper Foundations of the C++ Concurrency Memory Model Section 3, there is already a clear explanation for why the standard allows spurious failures of try_lock. In short, it is specified to make the semantics of try_lock be consistent with the definition of race in C++ memory model.

answered Oct 05 '22 07:10

chain ro

Related questions
                            
                                std::addressof as a constant expression in C++17
                            
                                Why is there a size mismatch between structures and unions?
                            
                                Qt - splitting a QString, using several types of whitespace as separators
                            
                                Opencv in memory Mat representation
                            
                                Debugging load time error in C++ SDL2 program compiled with VS2015 on Win10
                            
                                How to debug error "Symbol lookup error: undefined symbol"
                            
                                Empty class in std::tuple
                            
                                Is there a canonical way to allow implicit conversion of a non-const template argument type to a const one?
                            
                                Forward declaration of template friend function
                            
                                Can C++17's deduced `auto` non-type `template` parameters pattern-match templates with explicit non-type parameters?
                            
                                How to force a constexpr function to be evaluated at compile time [duplicate]
                            
                                Deterministic way of saying "promote everything to floating before calculation" in C++
                            
                                Can bit fields be used in C++ classes?
                            
                                Is there any way to know which headers are automatically included in C++
                            
                                How to get array from nlohmann json
                            
                                c++ Is there STL algorithm to check if the range is strictly sorted?
                            
                                C++ Combining Static and Dynamic Polymorphism to Create "Hyper Polymorphism"?
                            
                                Why do const shared_ptr<const T>& and const shared_ptr<T>& show different reference counts?
                            
                                Efficient way to set first N or last N bits of __m256i to 1, the rest to 0
                            
                                Best practice to use execvp in C++

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

std::mutex::try_lock spuriously fail?

Tags:

c++

mutex

locking

markzzz

People also ask

4 Answers

Useless

Daniel Langr

StaceyGirl

chain ro

Recent Activity

Donate For Us