Reading the asyncio documentation, I realize that I don't understand a very basic and fundamental aspect: the difference between awaiting a coroutine directly, and awaiting the same coroutine when it's wrapped inside a task.
In the documentation examples the two calls to the say_after
coroutine are running sequentially when awaited without create_task
, and concurrently when wrapped in create_task
. So I understand that this is basically the difference, and that it is quite an important one.
However what confuses me is that in the example code I read everywhere (for instance showing how to use aiohttp
), there are many places where a (user-defined) coroutine is awaited (usually in the middle of some other user-defined coroutine) without being wrapped in a task, and I'm wondering why that is the case. What are the criteria to determine when a coroutine should be wrapped in a task or not?
Tasks within Asyncio are responsible for the execution of coroutines within an event loop. These tasks can only run in one event loop at one time and in order to achieve parallel execution you would have to run multiple event loops over multiple threads.
@coroutine is a functional analogue for async def but it works in Python 3.4+ and utilizes yield from construction instead of await . For practical perspective just never use @coroutine if your Python is 3.5+.
Coroutines are generalizations of subroutines. They are used for cooperative multitasking where a process voluntarily yield (give away) control periodically or when idle in order to enable multiple applications to be run simultaneously.
It should be used as a main entry point for asyncio programs, and should ideally only be called once. New in version 3.7.
What are the criteria to determine when a coroutine should be wrapped in a task or not?
You should use a task when you want your coroutine to effectively run in the background. The code you've seen just awaits the coroutines directly because it needs them running in sequence. For example, consider an HTTP client sending a request and waiting for a response:
# these two don't make too much sense in parallel
await session.send_request(req)
resp = await session.read_response()
There are situations when you want operations to run in parallel. In that case asyncio.create_task
is the appropriate tool, because it turns over the responsibility to execute the coroutine to the event loop. This allows you to start several coroutines and sit idly while they execute, typically waiting for some or all of them to finish:
dl1 = asyncio.create_task(session.get(url1))
dl2 = asyncio.create_task(session.get(url2))
# run them in parallel and wait for both to finish
resp1 = await dl1
resp2 = await dl2
# or, shorter:
resp1, resp2 = asyncio.gather(session.get(url1), session.get(url2))
As shown above, a task can be awaited as well. Just like awaiting a coroutine, that will block the current coroutine until the coroutine driven by the task has completed. In analogy to threads, awaiting a task is roughly equivalent to join()-ing a thread (except you get back the return value). Another example:
queue = asyncio.Queue()
# read output from process in an infinite loop and
# put it in a queue
async def process_output(cmd, queue, identifier):
proc = await asyncio.create_subprocess_shell(cmd)
while True:
line = await proc.readline()
await queue.put((identifier, line))
# create multiple workers that run in parallel and pour
# data from multiple sources into the same queue
asyncio.create_task(process_output("top -b", queue, "top")
asyncio.create_task(process_output("vmstat 1", queue, "vmstat")
while True:
identifier, output = await queue.get()
if identifier == 'top':
# ...
In summary, if you need the result of a coroutine in order to proceed, you should just await it without creating a task, i.e.:
# this is ok
resp = await session.read_response()
# unnecessary - it has the same effect, but it's
# less efficient
resp = await asyncio.create_task(session.read_reponse())
To continue with the threading analogy, creating a task just to await it immediately is like running t = Thread(target=foo); t.start(); t.join()
instead of just foo()
- inefficient and redundant.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With