
Does the lock in asyncio.Condition have any purpose besides compatibility with threading.Condition?

I'd like to ask about asyncio.Condition. I'm not familiar with the concept, but I know and understand locks, semaphores, and queues since my student years.

I could not find a good explanation or typical use cases, just this example. I looked at the source. The core functionality is achieved with a FIFO of futures. Each waiting coroutine adds a new future and awaits it. Another coroutine may call notify(), which sets the result of one or, optionally, more futures from the FIFO, and that wakes up the same number of waiting coroutines. Really simple up to this point.
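To make the description above concrete, here is a minimal sketch of only the FIFO-of-futures part, with no lock at all. BareCondition and demo are hypothetical names made up for illustration; this is not the asyncio source, just the mechanism as described in the question.

```python
import asyncio
from collections import deque


class BareCondition:
    """Hypothetical, lock-free illustration: only the FIFO of futures."""

    def __init__(self):
        self._waiters = deque()

    async def wait(self):
        # Each waiter appends its own future and suspends on it.
        fut = asyncio.get_running_loop().create_future()
        self._waiters.append(fut)
        try:
            await fut
        finally:
            self._waiters.remove(fut)

    def notify(self, n=1):
        # Resolve up to n pending futures, oldest first.
        woken = 0
        for fut in self._waiters:
            if woken >= n:
                break
            if not fut.done():
                fut.set_result(True)
                woken += 1


async def demo():
    cond = BareCondition()
    woken = []

    async def waiter(name):
        await cond.wait()
        woken.append(name)

    tasks = [asyncio.create_task(waiter(i)) for i in range(3)]
    await asyncio.sleep(0.01)   # let all three waiters enqueue
    cond.notify(2)              # wake the two oldest waiters
    await asyncio.sleep(0.01)   # let the woken waiters run
    result = list(woken)
    for task in tasks:
        task.cancel()           # clean up the waiter that was never notified
    await asyncio.gather(*tasks, return_exceptions=True)
    return result
```

Running asyncio.run(demo()) should return [0, 1]: two notifications wake the two oldest waiters, in FIFO order.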

However, the implementation and the usage are more complicated than this. A waiting coroutine must first acquire a lock associated with the condition in order to be able to wait (and wait() releases it while waiting). The notifier must also acquire the lock to be able to notify(). This leads to an async with statement before each operation:

async with condition:
    # condition operation (wait or notify)

or else a RuntimeError is raised.
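A small sketch of that behaviour, assuming a plain asyncio.Condition() with its default lock (demo is a hypothetical helper name): both notify() and wait() raise RuntimeError when called outside the async with block.

```python
import asyncio


async def demo():
    cond = asyncio.Condition()
    errors = []

    try:
        cond.notify()            # lock not held
    except RuntimeError as exc:
        errors.append(str(exc))

    try:
        await cond.wait()        # lock not held
    except RuntimeError as exc:
        errors.append(str(exc))

    async with cond:             # lock held: no error
        cond.notify()

    return errors
```

asyncio.run(demo()) returns the two error messages; the call made inside async with succeeds.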

I do not understand the point of having this lock. What resource do we need to protect with the lock? In asyncio only one coroutine can be executing in the event loop at any time; there are no "critical sections" as known from threading.

Is this lock really needed (why?) or is it for compatibility with threading code only?

My first idea was that it is for compatibility, but in that case why didn't they remove the lock while preserving the usage? i.e. making

async with condition:

basically an optional no-op.

VPfB asked Jul 25 '18 10:07


1 Answer

The answer for this is essentially the same as for threading.Condition vs threading.Event; a condition without a lock is an event, not a condition(*).

Conditions are used to signal that a resource is available. Whoever was waiting on the condition can use that resource until they are done with it. To ensure that no one else can use the resource in the meantime, you need to lock it:

resource = get_some_resource()

async with resource.condition:
    await resource.condition.wait()
    # this resource is mine, no-one will touch it
    await resource.do_something_async()

# lock released, resource is available again for the next user

Note how the lock is not released after wait() resumes! Until the lock is released, no other coroutine waiting for the same condition can proceed; access to the resource is made exclusive by virtue of the lock. The lock is released while waiting, so other coroutines can add themselves to the queue, but for wait() to finally return, the lock must first be re-acquired.
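The following sketch illustrates that exclusivity under simple assumptions: two hypothetical users wait on the same condition, both are notified at once, and asyncio.sleep() stands in for the asynchronous work on the resource. Because wait() re-acquires the lock before returning, their work sections should not interleave even though each one suspends in the middle.

```python
import asyncio


async def demo():
    cond = asyncio.Condition()
    log = []

    async def user(name):
        async with cond:
            await cond.wait()
            # The lock is held again here, so this section is exclusive.
            log.append(f"{name} start")
            await asyncio.sleep(0.01)   # stand-in for do_something_async()
            log.append(f"{name} end")

    tasks = [asyncio.create_task(user(n)) for n in ("a", "b")]
    await asyncio.sleep(0.01)           # let both users reach wait()

    async with cond:
        cond.notify_all()               # wake both; they proceed one at a time

    await asyncio.gather(*tasks)
    return log
```

In the returned log, every "start" entry is immediately followed by the "end" entry of the same user.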

If you don't need to coordinate access to a shared resource, use an event; a condition is basically a lock and event combined into one primitive, avoiding common implementation pitfalls.

Note that multiple conditions can share locks. This would let you signal specific stages, and other coroutines can wait for that specific stage to arrive. The shared lock would coordinate access to a single resource, but different conditions are signalled when each stage is initiated.
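As a rough sketch of the shared-lock idea, asyncio.Condition accepts an existing asyncio.Lock through its lock argument. The stage names and the helper wait_for_stage below are hypothetical, chosen only for illustration.

```python
import asyncio


async def demo():
    lock = asyncio.Lock()                       # one lock guards the resource
    stage_one = asyncio.Condition(lock=lock)    # both conditions share it
    stage_two = asyncio.Condition(lock=lock)
    log = []

    async def wait_for_stage(cond, name):
        async with cond:
            await cond.wait()
            log.append(name)

    t1 = asyncio.create_task(wait_for_stage(stage_one, "one"))
    t2 = asyncio.create_task(wait_for_stage(stage_two, "two"))
    await asyncio.sleep(0.01)                   # let both tasks reach wait()

    async with lock:                            # holding the shared lock is enough
        stage_two.notify_all()                  # only stage-two waiters wake up
    await t2
    after_two = list(log)

    async with stage_one:
        stage_one.notify_all()
    await t1
    return after_two, log
```

asyncio.run(demo()) should return (["two"], ["two", "one"]): signalling one stage leaves the waiters of the other stage asleep, while all of them still serialize on the same lock.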

For threading, the typical use case offered for conditions is that of a single producer and multiple consumers, all waiting on items from the producer to process. The work queue is the shared resource: the producer acquires the condition lock to push an item into the queue and then calls notify(), at which point the next consumer waiting on the condition is given the lock (as it returns from wait()) and can remove the item from the queue to work on. This doesn't quite translate to a coroutine-based application, as coroutines don't have the sitting-idle-waiting-for-work-to-be-done problems threading systems have; it's much easier to just spin up consumer coroutines as needed (with perhaps a semaphore to impose a ceiling).
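For reference, this is roughly what the producer/consumer pattern looks like when transliterated to asyncio.Condition. It is a sketch only: the deque, the consumer names and the item values are made up, and in real asyncio code an asyncio.Queue would usually be the simpler choice.

```python
import asyncio
from collections import deque


async def demo():
    cond = asyncio.Condition()
    queue = deque()          # the shared resource guarded by the condition lock
    handled = []

    async def consumer(name):
        async with cond:
            while not queue:             # re-check after every wake-up
                await cond.wait()
            item = queue.popleft()       # lock is held: safe to take the item
        handled.append((name, item))

    async def producer(items):
        for item in items:
            async with cond:
                queue.append(item)
                cond.notify()            # wake one waiting consumer
            await asyncio.sleep(0)       # give consumers a chance to run

    consumers = [asyncio.create_task(consumer(n)) for n in ("c1", "c2", "c3")]
    await asyncio.sleep(0.01)            # let all consumers start waiting
    await producer(["x", "y", "z"])
    await asyncio.gather(*consumers)
    return handled
```

Each of the three consumers ends up with exactly one of the three items.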

Perhaps a better example is the aioimaplib library, which supports IMAP4 transactions in full. These transactions are asynchronous, but you need to have access to the shared connection resource. So the library uses a single Condition object and wait_for() to wait for a specific state to arrive and thus give exclusive connection access to the coroutine waiting for that transaction state.
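The general shape of that pattern can be sketched as follows. This is not aioimaplib's actual code: the Connection class, its state names, and the run_when and set_state helpers are hypothetical, and only Condition.wait_for() and notify_all() are real asyncio calls.

```python
import asyncio


class Connection:
    """Hypothetical stand-in for a shared connection with a protocol state."""

    def __init__(self):
        self.state = "CONNECTING"
        self.condition = asyncio.Condition()

    async def set_state(self, new_state):
        async with self.condition:
            self.state = new_state
            self.condition.notify_all()      # let waiters re-check their predicate

    async def run_when(self, wanted, command):
        async with self.condition:
            # Returns only once the predicate is true, with the lock held.
            await self.condition.wait_for(lambda: self.state == wanted)
            return f"{command} sent in state {self.state}"


async def demo():
    conn = Connection()
    pending = asyncio.create_task(conn.run_when("AUTH", "SELECT INBOX"))
    await asyncio.sleep(0.01)                # let the task start waiting

    await conn.set_state("NONAUTH")          # wakes the waiter, predicate still false
    await asyncio.sleep(0.01)
    still_waiting = not pending.done()

    await conn.set_state("AUTH")             # predicate becomes true
    return still_waiting, await pending
```

asyncio.run(demo()) should return (True, "SELECT INBOX sent in state AUTH"): the waiter ignores the intermediate state and proceeds, holding the lock, only when the wanted state arrives.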


(*): Events have a different use case from conditions, and thus behave a little differently from a condition without locking. Once set, an event needs to be cleared explicitly, while a condition 'auto-clears' when used, and is never 'set' when no one is waiting on the condition. But if you want to signal between tasks and don't need to control access to a shared resource, then you probably wanted an event.
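A short sketch of that difference, using short timeouts purely for demonstration (demo and wait_on_condition are hypothetical names): a signal sent before anyone waits is remembered by an Event but lost on a Condition.

```python
import asyncio


async def demo():
    event = asyncio.Event()
    cond = asyncio.Condition()

    # Signal both primitives while nobody is waiting.
    event.set()
    async with cond:
        cond.notify_all()        # no waiters: nothing is remembered

    # The event stays set, so a late waiter returns immediately.
    event_seen = await asyncio.wait_for(event.wait(), timeout=0.1)

    async def wait_on_condition():
        async with cond:
            await cond.wait()

    # The earlier notification is gone, so a late waiter times out.
    try:
        await asyncio.wait_for(wait_on_condition(), timeout=0.1)
        cond_seen = True
    except asyncio.TimeoutError:
        cond_seen = False

    return event_seen, cond_seen
```

asyncio.run(demo()) should return (True, False).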

Martijn Pieters answered Oct 31 '22 09:10