
Difference between coroutine and future/task in Python 3.5?

Let's say we have a dummy function:

async def foo(arg):
    result = await some_remote_call(arg)
    return result.upper()

What's the difference between:

import asyncio    

coros = []
for i in range(5):
    coros.append(foo(i))

loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.wait(coros))

And:

import asyncio

futures = []
for i in range(5):
    futures.append(asyncio.ensure_future(foo(i)))

loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.wait(futures))

Note: The example returns a result, but this isn't the focus of the question. When return value matters, use gather() instead of wait().

Regardless of return value, I'm looking for clarity on ensure_future(). wait(coros) and wait(futures) both run the coroutines, so when and why should a coroutine be wrapped in ensure_future?

Basically, what's the Right Way (tm) to run a bunch of non-blocking operations using Python 3.5's async?

For extra credit, what if I want to batch the calls? For example, I need to call some_remote_call(...) 1000 times, but I don't want to crush the web server/database/etc with 1000 simultaneous connections. This is doable with a thread or process pool, but is there a way to do this with asyncio?

2020 update (Python 3.7+): Don't use these snippets. Instead use:

import asyncio

async def do_something_async():
    tasks = []
    for i in range(5):
        tasks.append(asyncio.create_task(foo(i)))
    await asyncio.gather(*tasks)

def do_something():
    asyncio.run(do_something_async())

Also consider using Trio, a robust 3rd party alternative to asyncio.

knite asked Jan 12 '16 20:01



4 Answers

A coroutine is a generator function that can both yield values and accept values from the outside. The benefit of using a coroutine is that we can pause the execution of a function and resume it later. In case of a network operation, it makes sense to pause the execution of a function while we're waiting for the response. We can use the time to run some other functions.

A future is like the Promise objects from Javascript. It is like a placeholder for a value that will be materialized in the future. In the above-mentioned case, while waiting on network I/O, a function can give us a container, a promise that it will fill the container with the value when the operation completes. We hold on to the future object and when it's fulfilled, we can call a method on it to retrieve the actual result.
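To make the "container" idea concrete, here is a minimal sketch. It assumes Python 3.7+ (for asyncio.run), and the "response body" string and the demo_future name are made up for illustration; a timer stands in for the network response that would normally fill the future.

```python
import asyncio


async def demo_future():
    loop = asyncio.get_running_loop()
    future = loop.create_future()      # empty container, no value yet
    was_pending = not future.done()

    # Hypothetical stand-in for a network response arriving a bit later.
    loop.call_later(0.01, future.set_result, "response body")

    value = await future               # suspends until set_result is called
    return was_pending, value, future.result()


if __name__ == "__main__":
    print(asyncio.run(demo_future()))
```

The future starts out pending, and both await and result() hand back the same value once it has been set.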

Direct Answer: You don't need ensure_future if you don't need the results. It is useful if you need the results or want to retrieve any exceptions that occurred.

Extra Credits: I would choose run_in_executor and pass an Executor instance to control the number of max workers.
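A rough sketch of that idea, assuming the remote call is a blocking function (blocking_remote_call below is a hypothetical stand-in, not something from the question) and Python 3.7+. The pool size caps how many calls run at the same time.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_remote_call(arg):
    # Hypothetical blocking call; sleep stands in for network latency.
    time.sleep(0.01)
    return str(arg).upper()


async def run_batched(args, max_workers=3):
    loop = asyncio.get_running_loop()
    # At most max_workers calls are in progress at any moment.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            loop.run_in_executor(pool, blocking_remote_call, arg)
            for arg in args
        ]
        # gather returns the results in the order the futures were passed.
        return await asyncio.gather(*futures)


if __name__ == "__main__":
    print(asyncio.run(run_batched(["a", "b", "c", "d", "e"])))
```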

Explanations and Sample codes

In the first example, you are using coroutines. The wait function takes a bunch of coroutines and combines them together. So wait() finishes when all the coroutines are exhausted (completed/finished returning all the values).

loop = get_event_loop()
loop.run_until_complete(wait(coros))

The run_until_complete method would make sure that the loop is alive until the execution is finished. Please notice how you are not getting the results of the async execution in this case.

In the second example, you are using the ensure_future function to wrap a coroutine and return a Task object which is a kind of Future. The coroutine is scheduled to be executed in the main event loop when you call ensure_future. The returned future/task object doesn't yet have a value but over time, when the network operations finish, the future object will hold the result of the operation.

from asyncio import ensure_future, get_event_loop, wait

futures = []
for i in range(5):
    futures.append(ensure_future(foo(i)))

loop = get_event_loop()
loop.run_until_complete(wait(futures))

So in this example, we're doing the same thing except we're using futures instead of just using coroutines.
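Because the futures are kept in a list, the results can be read back once wait() finishes. A minimal sketch, assuming Python 3.7+ and a stand-in foo that sleeps instead of calling some_remote_call:

```python
import asyncio


async def foo(arg):
    await asyncio.sleep(0)  # stand-in for `await some_remote_call(arg)`
    return str(arg).upper()


async def collect_results():
    tasks = [asyncio.ensure_future(foo(f"item{i}")) for i in range(3)]
    done, pending = await asyncio.wait(tasks)
    # Each task now holds its own result; read them in submission order.
    return [task.result() for task in tasks], len(pending)


if __name__ == "__main__":
    print(asyncio.run(collect_results()))
```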

Let's look at an example of how to use asyncio/coroutines/futures:

import asyncio


async def slow_operation():
    await asyncio.sleep(1)
    return 'Future is done!'


def got_result(future):
    print(future.result())

    # We have result, so let's stop
    loop.stop()


loop = asyncio.get_event_loop()
task = loop.create_task(slow_operation())
task.add_done_callback(got_result)

# We run forever
loop.run_forever()

Here, we have used the create_task method on the loop object. ensure_future would schedule the task in the main event loop, whereas loop.create_task lets us schedule a coroutine on a loop we choose.

We also see the concept of adding a callback using the add_done_callback method on the task object.

A Task is done when the coroutine returns a value, raises an exception, or gets cancelled. There are methods to check for each of these outcomes.
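The methods in question are done(), result(), exception() and cancelled(). A small sketch of all three outcomes, assuming Python 3.7+; the three inner coroutines are made up for the demonstration:

```python
import asyncio


async def inspect_tasks():
    async def ok():
        return "value"

    async def boom():
        raise ValueError("failed")

    async def slow():
        await asyncio.sleep(3600)

    t_ok = asyncio.ensure_future(ok())
    t_err = asyncio.ensure_future(boom())
    t_cancel = asyncio.ensure_future(slow())

    await asyncio.sleep(0)   # let the tasks start running
    t_cancel.cancel()        # request cancellation of the slow one
    # return_exceptions=True keeps gather from raising here.
    await asyncio.gather(t_ok, t_err, t_cancel, return_exceptions=True)

    return {
        "ok": (t_ok.done(), t_ok.result()),
        "error": (t_err.done(), type(t_err.exception()).__name__),
        "cancelled": (t_cancel.done(), t_cancel.cancelled()),
    }


if __name__ == "__main__":
    print(asyncio.run(inspect_tasks()))
```

Note that done() is True in all three cases; result(), exception() and cancelled() tell you which way the task finished.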

I have written some blog posts on these topics which might help:

  • http://masnun.com/2015/11/13/python-generators-coroutines-native-coroutines-and-async-await.html
  • http://masnun.com/2015/11/20/python-asyncio-future-task-and-the-event-loop.html
  • http://masnun.com/2015/12/07/python-3-using-blocking-functions-or-codes-with-asyncio.html

Of course, you can find more details on the official manual: https://docs.python.org/3/library/asyncio.html

masnun answered Oct 11 '22 08:10


Simple answer

  • Invoking a coroutine function (async def) does NOT run it. It returns a coroutine object, just as calling a generator function returns a generator object.
  • await retrieves values from coroutines, i.e. it "calls" the coroutine.
  • ensure_future/create_task schedule the coroutine to run on the event loop on the next iteration (without waiting for it to finish, like a daemon thread).

Some code examples

Let's first clarify some terms:

  • coroutine function, the one you define with async def;
  • coroutine object, what you get when you "call" a coroutine function;
  • task, an object wrapped around a coroutine object to run it on the event loop.

Case 1, await on a coroutine

We create two coroutines, await one, and use create_task to run the other one.

import asyncio
import time

# coroutine function
async def p(word):
    print(f'{time.time()} - {word}')


async def main():
    loop = asyncio.get_event_loop()
    coro = p('await')  # coroutine
    task2 = loop.create_task(p('create_task'))  # <- runs in next iteration
    await coro  # <-- run directly
    await task2

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())

You will get this result:

1539486251.7055213 - await
1539486251.7055705 - create_task

Explanation:

coro was executed directly, and task2 was executed in the following iteration.

Case 2, yielding control to event loop

If we replace the main function, we can see a different result:

async def main():
    loop = asyncio.get_event_loop()
    coro = p('await')
    task2 = loop.create_task(p('create_task'))  # scheduled to next iteration
    await asyncio.sleep(1)  # loop got control, and runs task2
    await coro  # run coro
    await task2

You will get this result:

-> % python coro.py
1539486378.5244057 - create_task
1539486379.5252144 - await  # note the delay

Explanation:

When main awaits asyncio.sleep(1), control is yielded back to the event loop. The loop checks for tasks to run and then runs the task created by create_task.

Note that we first invoke the coroutine function without awaiting it, so we only create a coroutine object and do not start it. Then we call the coroutine function again and wrap the result in a create_task call, which actually schedules the coroutine to run on the next iteration. So, in the output, create_task is printed before await.

Actually, the point here is to give control back to the loop; you could use asyncio.sleep(0) to see the same result.
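A minimal sketch of the sleep(0) variant, assuming Python 3.7+. It records the order in a list instead of printing timestamps, and the demo_order name is made up for illustration:

```python
import asyncio


async def demo_order():
    order = []

    async def p(word):
        order.append(word)

    coro = p('await')                              # created, not started
    task2 = asyncio.create_task(p('create_task'))  # scheduled on the loop
    await asyncio.sleep(0)  # yield to the loop once, so task2 gets to run
    await coro              # only now does the bare coroutine run
    await task2
    return order


if __name__ == "__main__":
    print(asyncio.run(demo_order()))
```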

Under the hood

loop.create_task actually calls asyncio.tasks.Task(), which calls loop.call_soon. loop.call_soon then puts the task's first step in loop._ready. During each iteration, the loop goes through the callbacks in loop._ready and runs each of them.

asyncio.wait, asyncio.ensure_future and asyncio.gather actually call loop.create_task directly or indirectly.

Also note in the docs:

Callbacks are called in the order in which they are registered. Each callback will be called exactly once.
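The ordering guarantee can be seen with plain callbacks, without any coroutines involved. A small sketch, using a private event loop so that it does not interfere with any other loop; the callback names are arbitrary:

```python
import asyncio


def demo_call_soon_order():
    loop = asyncio.new_event_loop()
    calls = []
    try:
        # Register three callbacks, then one that stops the loop.
        for name in ("first", "second", "third"):
            loop.call_soon(calls.append, name)
        loop.call_soon(loop.stop)
        loop.run_forever()  # runs the ready callbacks, then stops
    finally:
        loop.close()
    return calls


if __name__ == "__main__":
    print(demo_call_soon_order())
```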

ospider answered Oct 11 '22 07:10


A comment by Vincent linked to https://github.com/python/asyncio/blob/master/asyncio/tasks.py#L346, which shows that wait() wraps the coroutines in ensure_future() for you!

In other words, we do need futures, and coroutines are silently converted into them.
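The conversion itself is easy to observe: ensure_future turns a coroutine into a Task and returns an existing future unchanged. A minimal sketch, assuming Python 3.7+ and a throwaway foo coroutine:

```python
import asyncio


async def demo_ensure_future():
    async def foo():
        return 42

    task = asyncio.ensure_future(foo())   # coroutine -> Task
    same = asyncio.ensure_future(task)    # already a future: returned as is
    result = await task
    return isinstance(task, asyncio.Task), same is task, result


if __name__ == "__main__":
    print(asyncio.run(demo_ensure_future()))
```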

I'll update this answer when I find a definitive explanation of how to batch coroutines/futures.
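One possible way to batch the calls, offered as a sketch and not as this answer's conclusion, is to guard each call with an asyncio.Semaphore. The some_remote_call body below is a hypothetical stand-in, the limit of 5 is arbitrary, and the peak counter exists only to show that the limit holds. Python 3.7+ is assumed.

```python
import asyncio


async def some_remote_call(arg):
    # Hypothetical stand-in for the real remote call.
    await asyncio.sleep(0.01)
    return str(arg)


async def run_limited(args, limit=5):
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def guarded(arg):
        nonlocal in_flight, peak
        async with semaphore:          # waits while `limit` calls are active
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await some_remote_call(arg)
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(guarded(arg) for arg in args))
    return results, peak


if __name__ == "__main__":
    results, peak = asyncio.run(run_limited(range(20)))
    print(len(results), "calls finished; peak concurrency:", peak)
```

All the coroutines are created up front, but only `limit` of them are inside some_remote_call at any moment.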

knite answered Oct 11 '22 08:10


From the BDFL [2013]

Tasks

  • It's a coroutine wrapped in a Future
  • class Task is a subclass of class Future
  • So it works with await too!

  • How does it differ from a bare coroutine?
  • It can make progress without waiting for it
    • As long as you wait for something else, i.e.
      • await [something_else]

With this in mind, ensure_future makes sense as a name for creating a Task, since the Future's result will be computed whether or not you await it (as long as you await something). This allows the event loop to complete your Task while you're waiting on other things. Note that in Python 3.7 create_task is the preferred way to ensure a future.

Note: I changed "yield from" in Guido's slides to "await" here for modernity.
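The "makes progress without waiting for it" point can be checked with a short sketch, assuming Python 3.7+; the work coroutine and the log list are made up for the demonstration:

```python
import asyncio


async def demo_progress():
    log = []

    async def work(name):
        log.append(name)

    task = asyncio.create_task(work("task"))  # scheduled right away
    coro = work("bare coroutine")             # nothing is scheduled

    await asyncio.sleep(0)                    # await "something else"
    after_sleep = list(log)                   # only the task has run so far
    task_done = task.done()

    await coro                                # the bare coroutine runs now
    return after_sleep, task_done, log


if __name__ == "__main__":
    print(asyncio.run(demo_progress()))
```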

crizCraig answered Oct 11 '22 06:10