How does asyncio.as_completed work

Tags: python, python-3.x, python-asyncio

Reading this answer, I ran across asyncio.tasks.as_completed. I don't understand how that function actually works. It is documented as being a non-async routine that returns futures in the order they complete. It creates a queue associated with the event loop, adds a completion callback to each future, and then attempts to get as many items from the queue as there are futures.

The core of the code is as follows:

    def _on_completion(f):
        if not todo:
            return  # _on_timeout() was here first.
        todo.remove(f)
        done.put_nowait(f)
        if not todo and timeout_handle is not None:
            timeout_handle.cancel()

    @coroutine
    def _wait_for_one():
        f = yield from done.get()
        if f is None:
            # Dummy value from _on_timeout().
            raise futures.TimeoutError
        return f.result()  # May raise f.exception().

    for f in todo:
        f.add_done_callback(_on_completion)
    if todo and timeout is not None:
        timeout_handle = loop.call_later(timeout, _on_timeout)
    for _ in range(len(todo)):
        yield _wait_for_one()

I'd like to understand how this code works. My biggest questions are:

Where does the loop actually run? I don't see any calls to loop.run_until_complete or loop.run_forever. So how does the loop make progress?

The method documentation says that the method returns futures, and that you could call it something like

    for f in as_completed(futures):
        result = yield from f

I'm having trouble reconciling that with the return f.result() line in _wait_for_one. Is the documented calling convention correct? If so, where does that yield come from?

asked May 18 '17 13:05

Sam Hartman

1 Answer

The code you copied is missing its header part, which is quite important.

# This is *not* a @coroutine!  It is just an iterator (yielding Futures).
def as_completed(fs, *, loop=None, timeout=None):
    """Return an iterator whose values are coroutines.

    When waiting for the yielded coroutines you'll get the results (or
    exceptions!) of the original Futures (or coroutines), in the order
    in which and as soon as they complete.

    This differs from PEP 3148; the proper way to use this is:

        for f in as_completed(fs):
            result = yield from f  # The 'yield from' may raise.
            # Use result.

    If a timeout is specified, the 'yield from' will raise
    TimeoutError when the timeout occurs before all Futures are done.

    Note: The futures 'f' are not necessarily members of fs.
    """
    if futures.isfuture(fs) or coroutines.iscoroutine(fs):
        raise TypeError("expect a list of futures, not %s" % type(fs).__name__)
    loop = loop if loop is not None else events.get_event_loop()
    todo = {ensure_future(f, loop=loop) for f in set(fs)}
    from .queues import Queue  # Import here to avoid circular import problem.
    done = Queue(loop=loop)
    timeout_handle = None

    def _on_timeout():
        for f in todo:
            f.remove_done_callback(_on_completion)
            done.put_nowait(None)  # Queue a dummy value for _wait_for_one().
        todo.clear()  # Can't do todo.remove(f) in the loop.

    def _on_completion(f):
        if not todo:
            return  # _on_timeout() was here first.
        todo.remove(f)
        done.put_nowait(f)
        if not todo and timeout_handle is not None:
            timeout_handle.cancel()

    @coroutine
    def _wait_for_one():
        f = yield from done.get()
        if f is None:
            # Dummy value from _on_timeout().
            raise futures.TimeoutError
        return f.result()  # May raise f.exception().

    for f in todo:
        f.add_done_callback(_on_completion)
    if todo and timeout is not None:
        timeout_handle = loop.call_later(timeout, _on_timeout)
    for _ in range(len(todo)):
        yield _wait_for_one()

[Where does the loop actually run?]

For simplicity's sake, suppose that timeout is set to None.

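as_completed expects an iterable of futures or coroutines, and it never runs the loop itself. It is meant to be called from code that is already running inside the loop, which was started elsewhere with loop.run_until_complete or loop.run_forever. Futures produced by loop.create_task or asyncio.ensure_future are already bound to the loop and scheduled for execution; bare coroutines get the same treatment from the ensure_future call on the "todo = { ..." line (this is not stated explicitly anywhere). So the loop is already "running" them, and once they complete, their .done() method returns True.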
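
To make the "where does the loop run" point concrete, here is a minimal sketch in modern syntax (async/await instead of @coroutine/yield from, assuming Python 3.7+ for asyncio.run). The work() helper and its delays are made up for illustration. The only place the loop is started is the asyncio.run() call at the bottom; as_completed itself starts nothing.

```python
import asyncio


async def work(delay, label):
    # Hypothetical worker: sleeps for `delay` seconds, then returns its label.
    await asyncio.sleep(delay)
    return label


async def main():
    results = []
    # as_completed() does not run the loop. Each `await` below suspends
    # main() and hands control back to the loop that is driving main().
    for next_done in asyncio.as_completed([work(0.1, "slow"), work(0.01, "fast")]):
        results.append(await next_done)
    return results


if __name__ == "__main__":
    # This is where the loop actually runs.
    print(asyncio.run(main()))
```

The results come back in completion order, not in the order the coroutines were passed in.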

Then the "done" queue is created. Note that the "done" queue is an instance of asyncio.Queue, i.e. a queue whose blocking methods (.get, .put) are implemented »using the loop«: they suspend the calling coroutine instead of blocking the thread.
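
A small sketch of that queue behaviour, assuming Python 3.7+ and using a made-up "item" payload: the get() call suspends the coroutine, and the value only shows up because the loop gets a chance to run a plain callback in the meantime.

```python
import asyncio


async def main():
    done = asyncio.Queue()
    loop = asyncio.get_running_loop()
    # Schedule a plain (non-async) callback. It can only run once main()
    # gives control back to the loop.
    loop.call_later(0.01, done.put_nowait, "item")
    assert done.empty()        # nothing has been put yet
    item = await done.get()    # suspends main(); the loop runs the callback
    return item


if __name__ == "__main__":
    print(asyncio.run(main()))
```

This is the same pattern as_completed relies on: a callback calls put_nowait, and a coroutine waiting in get() is woken up by the loop.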

On the line "todo = { ...", each element of fs is passed through ensure_future: a coroutine is wrapped in a Task »bound to the loop«, while an object that is already a future is returned as is. A little further down, _on_completion is added as a done callback to each of these futures.
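
What ensure_future and add_done_callback do to each element can be checked with a short experiment. This is only a sketch, assuming Python 3.7+; answer() and the `seen` list are hypothetical names used for illustration.

```python
import asyncio


async def answer():
    return 42


async def main():
    loop = asyncio.get_running_loop()
    existing = loop.create_future()
    # An object that is already a future comes back unchanged...
    same_object = asyncio.ensure_future(existing) is existing
    # ...while a coroutine is wrapped in a Task scheduled on the running loop.
    task = asyncio.ensure_future(answer())
    seen = []
    task.add_done_callback(seen.append)  # the callback receives the finished task
    value = await task
    await asyncio.sleep(0)               # one more loop turn, to be safe
    return same_object, isinstance(task, asyncio.Task), value, seen == [task]


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Note that the done callback is handed the future itself, not its result, which is why _on_completion can put that future straight into the queue.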

The _on_completion function is called by the loop when a future from the "fs" set completes, that is, when the loop has finished executing the corresponding coroutine.

The _on_completion function removes the completed future from the todo set and puts it (the future itself, which now holds the result) into the done queue. In other words, all that as_completed does is attach a done callback to each future, so that every future is moved into the done queue as soon as it completes.

Then, len(todo) times (which equals len(fs) when fs contains no duplicates), the as_completed function yields a coroutine that blocks on "yield from done.get()", waiting for _on_completion (or _on_timeout) to put an item into the done queue.
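
Putting the pieces together, the mechanism can be re-created in a few lines. The following is a simplified sketch, not the library's code: it drops the timeout handling, uses async/await, assumes Python 3.7+, and must be iterated from inside a running loop. The names as_completed_sketch and double_after are invented for the example.

```python
import asyncio


def as_completed_sketch(aws):
    """Simplified as_completed without timeout support (illustration only)."""
    todo = {asyncio.ensure_future(a) for a in set(aws)}
    done = asyncio.Queue()

    def _on_completion(fut):
        todo.discard(fut)
        done.put_nowait(fut)      # queue the finished future itself

    async def _wait_for_one():
        fut = await done.get()    # suspends until some future completes
        return fut.result()       # may re-raise the future's exception

    for fut in todo:
        fut.add_done_callback(_on_completion)
    for _ in range(len(todo)):
        yield _wait_for_one()     # one waiter per scheduled future


async def double_after(delay, x):
    # Hypothetical worker used only to exercise the sketch.
    await asyncio.sleep(delay)
    return 2 * x


async def main():
    out = []
    for nxt in as_completed_sketch([double_after(0.1, 1), double_after(0.01, 2)]):
        out.append(await nxt)
    return out


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The coroutines yielded by the generator are interchangeable: whichever one is awaited first receives whichever future finishes first.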

The "yield from"s, executed by the as_completed caller, will wait for a result to appear in the done queue.

[where does that yield come from?]

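It comes from the fact that done is an asyncio.Queue, so you can (asyncio-)block until a value is .put() in the queue. Each object yielded by as_completed is a _wait_for_one() coroutine; the caller's "yield from f" delegates to it, suspends inside done.get(), and finally evaluates to whatever _wait_for_one returns, i.e. f.result().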
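
The way a return value travels back through "yield from" can be seen with plain generators, without asyncio at all. In this sketch wait_for_one() is a stand-in for _wait_for_one(), the bare yield stands in for the suspension inside done.get(), and drive() plays the part of the event loop; all three names are made up for the example.

```python
def wait_for_one():
    # The bare yield plays the role of the suspension inside
    # `yield from done.get()`; the loop later sends in the finished item.
    finished = yield "suspended"
    return finished.upper()      # plays the role of `return f.result()`


def caller():
    # `yield from` passes the inner yield up to whoever drives caller(),
    # then evaluates to wait_for_one()'s return value.
    result = yield from wait_for_one()
    return result


def drive():
    gen = caller()
    first = next(gen)            # runs until the innermost yield
    try:
        gen.send("done")         # what the event loop does on wake-up
    except StopIteration as stop:
        return first, stop.value


if __name__ == "__main__":
    print(drive())
```

So the yield that suspends the caller lives at the bottom of the delegation chain, and the `return` in the inner coroutine becomes the value of the caller's `yield from` expression.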

answered Oct 06 '22 23:10

MadeR
