
Spurious unblocking in boost thread

I came across this interesting paragraph in the Boost thread documentation today:

void wait(boost::unique_lock<boost::mutex>& lock)

...

Effects: Atomically call lock.unlock() and blocks the current thread. The thread will unblock when notified by a call to this->notify_one() or this->notify_all(), or spuriously. When the thread is unblocked (for whatever reason), the lock is reacquired by invoking lock.lock() before the call to wait returns. The lock is also reacquired by invoking lock.lock() if the function exits with an exception.

So what I am interested in is the meaning of the word "spuriously". Why would the thread be unblocked for spurious reasons? What can be done to resolve this?

1800 INFORMATION asked Mar 09 '09 08:03


2 Answers

This article by Anthony Williams is particularly detailed.

Spurious wakes cannot be predicted: they are essentially random from the user's point of view. However, they commonly occur when the thread library cannot reliably ensure that a waiting thread will not miss a notification. Since a missed notification would render the condition variable useless, the thread library wakes the thread from its wait rather than take the risk.

He also points out that you shouldn't use the timed_wait overloads that take a duration, and that you should generally use the versions that take a predicate:

That's the beginner's bug, and one that's easily overcome with a simple rule: always check your predicate in a loop when waiting with a condition variable. The more insidious bug comes from timed_wait().
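One way to read that advice: if you loop around a timed wait that takes a relative duration, every spurious wake restarts the full timeout, so the total wait can run well past what you intended. A rough sketch of the usual workaround is below: compute an absolute deadline once and wait against that. It uses std::condition_variable and wait_until as a stand-in for boost::condition_variable and its absolute-time timed_wait (that substitution is my assumption, made so the snippet builds without Boost), and wait_ready_for is a made-up helper name.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper, for illustration only: wait until `ready` becomes true
// or until `timeout` has elapsed in total. Returns the final value of `ready`.
bool wait_ready_for(std::mutex& m, std::condition_variable& cv,
                    const bool& ready, std::chrono::milliseconds timeout) {
    // Compute the deadline once, so a spurious wake cannot restart the clock.
    const auto deadline = std::chrono::steady_clock::now() + timeout;

    std::unique_lock<std::mutex> lock(m);
    while (!ready) {  // re-check the predicate after every wake
        if (cv.wait_until(lock, deadline) == std::cv_status::timeout) {
            break;    // the deadline has passed; stop waiting
        }
    }
    return ready;     // still holding the lock, so this read is safe
}
```

The predicate overloads do the same loop for you, which is presumably why they are the recommended form.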

This article by Vladimir Prus is also interesting.

But why do we need the while loop, can't we write:

if (!something_happened)
  c.wait(m);

We can't. And the killer reason is that 'wait' can return without any 'notify' call. That's called spurious wakeup and is explicitly allowed by POSIX. Essentially, return from 'wait' only indicates that the shared data might have changed, so that data must be evaluated again.

Okay, so why this is not fixed yet? The first reason is that nobody wants to fix it. Wrapping call to 'wait' in a loop is very desired for several other reasons. But those reasons require explanation, while spurious wakeup is a hammer that can be applied to any first year student without fail.
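For what it's worth, here is a minimal sketch of the loop version that quote is arguing for. It is written against std::condition_variable rather than boost::condition_variable (my assumption, so it compiles without Boost; the wait/notify calls have the same shape), and Event, happened_ and run_example are names invented for the example.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical one-shot flag, used only to illustrate the wait loop.
class Event {
public:
    // Block until set() has been called, tolerating spurious wakeups.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        while (!happened_) {   // 'while', not 'if': re-check after every wake
            cond_.wait(lock);  // may return without any notify call
        }
        assert(happened_);     // holds here no matter why we woke up
    }

    void set() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            happened_ = true;  // change the shared state under the mutex
        }
        cond_.notify_all();
    }

    bool is_set() {
        std::lock_guard<std::mutex> lock(mutex_);
        return happened_;
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    bool happened_ = false;
};

// Usage sketch: one thread waits while another sets the flag.
bool run_example() {
    Event ev;
    std::thread waiter([&ev] { ev.wait(); });
    ev.set();
    waiter.join();
    return ev.is_set();
}
```

Because the flag is checked before the first wait, the loop also handles the case where set() runs before the waiter ever blocks, which is one of the "other reasons" a loop is wanted anyway.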

1800 INFORMATION answered Sep 28 '22 09:09


This blog post gives a reason for Linux, in terms of the futex system call returning when a signal is delivered to a process. Unfortunately it doesn't explain anything else (and indeed is asking for more information).

The Wikipedia entry on spurious wakeups (which appear to be a POSIX-wide concept, btw, not limited to Boost) may interest you too.

Jon Skeet answered Sep 28 '22 09:09