I'm trying to understand coroutines in Python (and in general). I've been reading about the theory, the concept and a few examples, but I'm still struggling. I understand the asynchronous model (I did a bit of Twisted) but not coroutines yet.
One tutorial gives this as a coroutine example (I made a few changes to illustrate my problem):
    import os
    import urllib.request

    async def download_coroutine(url, number):
        """
        A coroutine to download the specified url
        """
        request = urllib.request.urlopen(url)
        filename = os.path.basename(url)
        print("Downloading %s" % url)

        with open(filename, 'wb') as file_handle:
            while True:
                print(number)  # prints numbers to view progress
                chunk = request.read(1024)
                if not chunk:
                    print("Finished")
                    break
                file_handle.write(chunk)
        msg = 'Finished downloading {filename}'.format(filename=filename)
        return msg
It is run like this:

    coroutines = [download_coroutine(url, number) for number, url in enumerate(urls)]
    completed, pending = await asyncio.wait(coroutines)
Looking at generator coroutine examples I can see a few yield statements. There's nothing like that here, and urllib is synchronous, AFAIK.
Also, since the code is supposed to be asynchronous, I am expecting to see a series of interleaved numbers (1, 4, 5, 1, 2, ..., "Finished", ...). What I'm seeing instead is a single number repeating, ending in a "Finished", and then another one (3, 3, 3, 3, ..., "Finished", 1, 1, 1, 1, ..., "Finished", ...).
At this point I'm tempted to say the tutorial is wrong, and this is a coroutine just because it has async in front.
Coroutines are computer program components that generalize subroutines for non-preemptive multitasking, by allowing execution to be suspended and resumed. Coroutines are well-suited for implementing familiar program components such as cooperative tasks, exceptions, event loops, iterators, infinite lists and pipes.
Characteristics of a coroutine are: the values of local data persist between successive calls (context switches); execution is suspended as control leaves the coroutine and is resumed at some later time; and the control-transfer mechanism is either symmetric or asymmetric.
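The first two characteristics above can be seen in a plain Python generator, which is the older, generator-based style of coroutine the question alludes to. This is only a minimal sketch; `running_total` and the values sent to it are made up for illustration.

```python
def running_total():
    """Generator-based coroutine that keeps a sum across suspensions."""
    total = 0  # local state survives every suspension
    while True:
        # Execution pauses here until the caller sends the next value.
        value = yield total
        total += value


acc = running_total()
next(acc)              # run to the first yield (the priming step)
first = acc.send(10)   # resume, add 10, suspend again
second = acc.send(5)   # resume with the previous total still intact
acc.close()
```

Each `send()` resumes the function exactly where it stopped, and `total` is still there when it does.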
In Unity, coroutines can be used to execute a piece of code across multiple frames. They can also be used to keep executing a section of code until you tell it to stop. Such a coroutine contains a yield instruction that waits for the amount of time you specify.
A coroutine is a function that can pause its execution and resume from the same point once a condition is met. In other words, a coroutine is a special type of function, used in Unity, that stops executing until a certain condition is met and then continues from where it left off.
The co in coroutine stands for cooperative. Yielding (to other routines) makes a routine a co-routine, really, because only by yielding when waiting can other co-routines be interleaved. In the new async world of Python 3.5 and up, that usually is achieved by await-ing results from other coroutines.
By that definition, the code you found is not a coroutine. As far as Python is concerned, it is one: calling any function defined with async def returns a coroutine object, whatever the body does.
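The distinction can be checked with the standard `inspect` module. The sketch below uses a made-up function, `not_really_cooperative`, whose body never awaits anything; Python still treats it as a coroutine function.

```python
import inspect


async def not_really_cooperative():
    # No await anywhere: once started, this body runs start to finish.
    return 42


# The function itself is a "coroutine function" ...
is_coro_function = inspect.iscoroutinefunction(not_really_cooperative)

# ... and calling it runs none of the body; it only builds a coroutine object.
coro = not_really_cooperative()
is_coro_object = inspect.iscoroutine(coro)
type_name = type(coro).__name__

coro.close()  # discard it cleanly to avoid a "never awaited" warning
```

So the `async` keyword decides the type, while the presence of `await` points decides whether the code ever gives other routines a chance to run.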
So yes, the tutorial is... unhelpful, in that it used entirely synchronous, uncooperative code inside a coroutine function.
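The effect is easy to reproduce without any network access. In this sketch, `time.sleep` stands in for the blocking `urllib` calls and `asyncio.sleep` stands in for a real awaitable I/O operation; the worker names and the `log` list are made up for illustration.

```python
import asyncio
import time


async def blocking_worker(name, log):
    for step in range(2):
        log.append((name, step))
        time.sleep(0.01)  # blocks the whole event loop, like urllib does


async def cooperative_worker(name, log):
    for step in range(2):
        log.append((name, step))
        await asyncio.sleep(0.01)  # suspends and lets other tasks run


async def run_pair(worker):
    log = []
    await asyncio.gather(worker("a", log), worker("b", log))
    return log


blocking_log = asyncio.run(run_pair(blocking_worker))
cooperative_log = asyncio.run(run_pair(cooperative_worker))
```

With the blocking worker, "a" runs to completion before "b" logs anything, which matches the 3, 3, 3, ..., 1, 1, 1 pattern in the question. With the cooperative worker, "b" gets its first turn as soon as "a" reaches its first `await`.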
Instead of urllib, an asynchronous HTTP library would be needed, such as aiohttp:
    import os

    import aiohttp

    async def download_coroutine(url):
        """
        A coroutine to download the specified url
        """
        filename = os.path.basename(url)

        async with aiohttp.ClientSession() as session:
            async with session.get(url) as resp:
                with open(filename, 'wb') as fd:
                    while True:
                        chunk = await resp.content.read(1024)
                        if not chunk:
                            break
                        fd.write(chunk)

        msg = 'Finished downloading {filename}'.format(filename=filename)
        return msg
This coroutine can yield to other routines when waiting for a connection to be established, and when waiting for more network data, as well as when closing the session again.
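As for starting several downloads at once: passing bare coroutine objects to `asyncio.wait()`, as in the question, was deprecated in Python 3.8 and removed in 3.11. One alternative is `asyncio.gather()`. The sketch below uses a hypothetical stand-in, `fake_download`, in place of `download_coroutine` so that it runs without a network connection; the URLs are placeholders.

```python
import asyncio


async def fake_download(url):
    # Hypothetical stand-in for download_coroutine(url); the sleep
    # represents time spent waiting on the network.
    await asyncio.sleep(0.01)
    return 'Finished downloading {}'.format(url.rsplit('/', 1)[-1])


async def main(urls):
    # gather() wraps each coroutine in a task, runs them concurrently,
    # and returns the results in the same order as the arguments.
    return await asyncio.gather(*(fake_download(url) for url in urls))


# Placeholder URLs, for illustration only.
urls = ['http://example.com/a.pdf', 'http://example.com/b.pdf']
messages = asyncio.run(main(urls))
```

If `asyncio.wait()` is preferred, each coroutine can be wrapped with `asyncio.create_task()` first.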
We could further make the file writing asynchronous, but that has portability issues; the aiofiles project uses threads to off-load the blocking calls. Using that library, the code would need updating to:
    import aiofiles

    # inside the "async with session.get(url) as resp:" block
    async with aiofiles.open(filename, 'wb') as fd:
        while True:
            chunk = await resp.content.read(1024)
            if not chunk:
                break
            await fd.write(chunk)
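The same thread off-loading idea can be sketched with the standard library alone, using `loop.run_in_executor()`. This is not how aiofiles is implemented internally, only an illustration of the approach; `write_chunks`, the temporary file and the chunk data are all made up.

```python
import asyncio
import os
import tempfile


async def write_chunks(path, chunks):
    loop = asyncio.get_running_loop()
    with open(path, 'wb') as fd:
        for chunk in chunks:
            # None selects the loop's default thread pool; the event loop
            # stays free to run other tasks while the write happens.
            await loop.run_in_executor(None, fd.write, chunk)


# Throwaway file and made-up chunks, for illustration only.
path = os.path.join(tempfile.mkdtemp(), 'demo.bin')
asyncio.run(write_chunks(path, [b'abc', b'def']))

with open(path, 'rb') as check:
    written = check.read()
```

Each write is awaited before the next one starts, so the chunks still land in order.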
Note: the blog post has since been updated to fix these issues.