
Why does Python threading.Condition() notify() require a lock?

Tags:

python

python-3.x

multithreading

race-condition

condition-variable

My question refers specifically to why it was designed that way, due to the unnecessary performance implication.

When thread T1 has this code:

    cv.acquire()
    cv.wait()
    cv.release()

and thread T2 has this code:

    cv.acquire()
    cv.notify()  # requires that lock be held
    cv.release()

what happens is that T1 waits and releases the lock; T2 then acquires it and notifies cv, which wakes up T1. Now there is a race between T2's release() and T1's reacquisition of the lock on returning from wait(). If T1 tries to reacquire first, it is needlessly suspended again until T2's release() completes.
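For reference, here is a minimal runnable sketch of the two threads above, plus a call to notify() made without holding the lock, which the threading documentation says raises RuntimeError. The `ready` flag, the `woken` list and the helper names are assumptions added for illustration; they are not part of the snippets above.

```python
import threading

cv = threading.Condition()
ready = False   # illustrative predicate, not in the original snippets
woken = []      # records that T1 returned from wait()


def t1():
    cv.acquire()
    while not ready:    # re-check the predicate after every wake-up
        cv.wait()       # releases the lock while blocked, reacquires before returning
    woken.append("T1")
    cv.release()


def t2():
    global ready
    cv.acquire()
    ready = True
    cv.notify()         # requires that the lock be held
    cv.release()


def notify_without_lock():
    """Call notify() without holding the lock and return the error, if any."""
    try:
        cv.notify()
    except RuntimeError as exc:
        return exc
    return None


waiter = threading.Thread(target=t1)
waiter.start()
error = notify_without_lock()   # a RuntimeError instance
t2()
waiter.join(timeout=5)
```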

Note: I'm intentionally not using the with statement, to better illustrate the race with explicit calls.

This seems like a design flaw. Is there any rationale known for this, or am I missing something?

219

asked Sep 06 '17 13:09

Yam Marcovic

2 Answers

This is not a definitive answer, but it's supposed to cover the relevant details I've managed to gather about this problem.

First, Python's threading implementation is based on Java's. Java's Condition.signal() documentation reads:

An implementation may (and typically does) require that the current thread hold the lock associated with this Condition when this method is called.

Now, the question was why enforce this behavior in Python in particular. But first I want to cover the pros and cons of each approach.

As to why some think it's often a better idea to hold the lock, I found two main arguments:

1. From the moment a waiter acquire()s the lock (that is, before releasing it in wait()), it is guaranteed to be notified of signals. If the corresponding release() happened prior to signalling, this would allow the sequence (where P = producer and C = consumer) P: release(); C: acquire(); P: notify(); C: wait(), in which case the wait() corresponding to the acquire() of the same flow would miss the signal. There are cases where this doesn't matter (and could even be considered more accurate), but there are cases where it is undesirable. This is one argument.
2. When you notify() outside a lock, this may cause a scheduling priority inversion; that is, a low-priority thread might end up taking priority over a high-priority thread. Consider a work queue with one producer and two consumers (LC = low-priority consumer and HC = high-priority consumer), where LC is currently executing a work item and HC is blocked in wait().

The following sequence may occur:

    P                    LC                              HC
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                         execute(item)                   (in wait())
    lock()
    wq.push(item)
    release()
                         acquire()
                         item = wq.pop()
                         release();
    notify()
                                                         (wake-up)
                                                         while (wq.empty())
                                                           wait();

Whereas if the notify() happened before release(), LC wouldn't have been able to acquire() before HC had been woken up. This is where the priority inversion occurs. This is the second argument.
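The missed signal in the first argument comes down to notifications not being remembered: a notify() that arrives before the matching wait() is simply lost. Here is a small single-threaded sketch of that, using only threading.Condition (the 0.1-second timeout is an arbitrary choice for the demonstration):

```python
import threading

cv = threading.Condition()

with cv:
    cv.notify()                        # nobody is waiting yet, so nothing is recorded

with cv:
    signalled = cv.wait(timeout=0.1)   # returns False: the earlier notify() is gone
```

This is why waiters normally re-check a predicate under the lock rather than relying on the notification alone.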

The argument in favor of notifying outside of the lock is for high-performance threading, where a thread need not go back to sleep only to wake up again on the very next time slice it gets, as described in my question.

Python's `threading` Module

In Python, as I said, you must hold the lock while notifying. The irony is that the internal implementation does not allow the underlying OS to avoid priority inversion, because it enforces a FIFO order on the waiters. Of course, a deterministic waiter order could come in handy, but the question remains why it is enforced. It could be argued that it would be more precise to differentiate between the lock and the condition variable: in flows that require optimized concurrency and minimal blocking, acquire() should not by itself register a preceding waiting state; only the wait() call itself should.
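The FIFO behavior can be observed with a sketch like the following. It relies on how CPython's Condition currently keeps its waiters, which is an implementation detail rather than a documented guarantee; the `tokens` counter and all names are assumptions made up for the demonstration.

```python
import queue
import threading
import time

cv = threading.Condition()
tokens = 0               # illustrative predicate: number of pending wake-ups
registered = []          # order in which threads entered wait()
woken = queue.Queue()    # names of threads as they wake


def waiter(name):
    global tokens
    with cv:
        registered.append(name)   # still holding the lock, so this matches wait() order
        while tokens == 0:
            cv.wait()
        tokens -= 1
    woken.put(name)


threads = [threading.Thread(target=waiter, args=(f"w{i}",)) for i in range(4)]
for t in threads:
    t.start()

# Wait until every thread is blocked in wait(); checking under the lock
# guarantees no thread is between registering and waiting.
while True:
    with cv:
        if len(registered) == len(threads):
            break
    time.sleep(0.01)

wake_order = []
for _ in threads:
    with cv:
        tokens += 1
        cv.notify()               # wakes at most one waiter
    wake_order.append(woken.get(timeout=5))

for t in threads:
    t.join(timeout=5)
```

On CPython, `wake_order` comes out equal to `registered`, regardless of any priorities the OS scheduler might otherwise apply.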

Arguably, Python programmers would not care about performance to this extent anyway—although that still doesn't answer the question of why, when implementing a standard library, one should not allow several standard behaviors to be possible.

One thing which remains to be said is that the developers of the threading module might have specifically wanted a FIFO order for some reason, and found that this was somehow the best way of achieving it, and wanted to establish that as a Condition at the expense of the other (probably more prevalent) approaches. For this, they deserve the benefit of the doubt until they might account for it themselves.

answered Oct 02 '22 16:10

Yam Marcovic

There are several reasons which are compelling (when taken together).

1. The notifier needs to take a lock

Pretend that Condition.notifyUnlocked() exists.

The standard producer/consumer arrangement requires taking locks on both sides:

    def unlocked(qu, cv):  # qu is a thread-safe queue
        qu.push(make_stuff())
        cv.notifyUnlocked()

    def consume(qu, cv):
        with cv:
            while True:
                # vs. other consumers or spurious wakeups
                if qu: break
                cv.wait()
            x = qu.pop()
        use_stuff(x)

This fails because both the push() and the notifyUnlocked() can intervene between the if qu: and the wait().

Writing either of

    def lockedNotify(qu, cv):
        qu.push(make_stuff())
        with cv: cv.notify()

    def lockedPush(qu, cv):
        x = make_stuff()      # don't hold the lock here
        with cv: qu.push(x)
        cv.notifyUnlocked()

works (which is an interesting exercise to demonstrate). The second form has the advantage of removing the requirement that qu be thread-safe, but it costs no more locks to take it around the call to notify() as well.
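As a rough runnable counterpart using only what threading actually provides (there is no notifyUnlocked()), the push and the notify can both go under the lock. The function names, the deque and the `results` list are assumptions for illustration only:

```python
import collections
import threading


def produce(qu, cv, item):
    with cv:                 # the lock covers both the push and the notify
        qu.append(item)
        cv.notify()


def consume(qu, cv, results):
    with cv:
        # wait_for() re-checks the predicate under the lock after each wake-up
        cv.wait_for(lambda: len(qu) > 0)
        item = qu.popleft()
    results.append(item)     # use the item outside the lock


qu = collections.deque()     # accessed only under the lock in this sketch
cv = threading.Condition()
results = []

consumer = threading.Thread(target=consume, args=(qu, cv, results))
consumer.start()
produce(qu, cv, "stuff")
consumer.join(timeout=5)
```

Because the consumer checks the queue and starts waiting while holding the same lock the producer needs, the push cannot slip in between the check and the wait.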

It remains to explain the preference for doing so, especially given that (as you observed) CPython does wake up the notified thread to have it switch to waiting on the mutex (rather than simply moving it to that wait queue).

2. The condition variable itself needs a lock

The Condition has internal data that must be protected in case of concurrent waits/notifications. (Glancing at the CPython implementation, I see the possibility that two unsynchronized notify()s could erroneously target the same waiting thread, which could cause reduced throughput or even deadlock.) It could protect that data with a dedicated lock, of course; since we need a user-visible lock already, using that one avoids additional synchronization costs.
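A small sketch of the "same lock" point: a Condition can be constructed around a lock the caller already has, and entering the Condition acquires exactly that lock. The variable names are illustrative.

```python
import threading

lock = threading.Lock()
cv = threading.Condition(lock)   # the Condition wraps the caller's lock

with cv:
    held_inside = lock.locked()  # True: entering cv acquired `lock` itself
held_after = lock.locked()       # False: leaving cv released it
```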

3. Multiple wake conditions can need the lock

(Adapted from a comment on the blog post linked below.)

    def setSignal(box, cv):
        signal = False
        with cv:
            if not box.val:
                box.val = True
                signal = True
        if signal: cv.notifyUnlocked()

    def waitFor(box, v, cv):
        v = bool(v)   # to use ==
        while True:
            with cv:
                if box.val == v: break
                cv.wait()

Suppose box.val is False and thread #1 is waiting in waitFor(box,True,cv). Thread #2 calls setSignal; when it releases cv, #1 is still blocked on the condition. Thread #3 then calls waitFor(box,False,cv), finds that box.val is True, and waits. Then #2 calls notifyUnlocked(), waking #3, which is still unsatisfied and blocks again. Now #1 and #3 are both waiting, despite the fact that one of them must have its condition satisfied.

    def setTrue(box, cv):
        with cv:
            if not box.val:
                box.val = True
                cv.notify()

Now that situation cannot arise: either #3 arrives before the update and never waits, or it arrives during or after the update and has not yet waited, guaranteeing that the notification goes to #1, which returns from waitFor.

4. The hardware might need a lock

With wait morphing and no GIL (in some alternate or future implementation of Python), the memory ordering (cf. Java's rules) imposed by the lock-release after notify() and the lock-acquire on return from wait() might be the only guarantee of the notifying thread's updates being visible to the waiting thread.

5. Real-time systems might need it

Immediately after the POSIX text you quoted we find:

however, if predictable scheduling behavior is required, then that mutex shall be locked by the thread calling pthread_cond_broadcast() or pthread_cond_signal().

One blog post contains further discussion of the rationale and history of this recommendation (as well as of some of the other issues here).

answered Oct 02 '22 17:10

Davis Herring
