I'm trying to make ~300 API calls at the same time, so that I get all the results within a couple of seconds at most.
My pseudo-code looks like this:
import asyncio
import requests

def function_1():
    colors = ['yellow', 'green', 'blue']  # + ~300 other ones
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    res = loop.run_until_complete(get_color_info(colors))

async def get_color_info(colors):
    loop = asyncio.get_event_loop()
    responses = []
    for color in colors:
        print("getting color")
        url = "https://api.com/{}/".format(color)
        data = loop.run_in_executor(None, requests.get, url)
        r = await data
        responses.append(r.json())
    return responses
Doing this I see "getting color" printed out every second or so, and the code takes forever, so I'm pretty sure the requests don't run simultaneously. What am I doing wrong?
aiohttp with Native Coroutines (async/await)
Here is a typical pattern that accomplishes what you're trying to do. (Python 3.7+.) One major change is that you will need to move from requests, which is built for synchronous IO, to a package such as aiohttp that is built specifically to work with async/await (native coroutines):
import asyncio
import aiohttp  # pip install aiohttp aiodns

async def get(
    session: aiohttp.ClientSession,
    color: str,
    **kwargs
) -> dict:
    url = f"https://api.com/{color}/"
    print(f"Requesting {url}")
    resp = await session.request('GET', url=url, **kwargs)
    # Note that this may raise an exception for non-2xx responses
    # You can either handle that here, or pass the exception through
    data = await resp.json()
    print(f"Received data for {url}")
    return data

async def main(colors, **kwargs):
    # Asynchronous context manager. Prefer this rather
    # than using a different session for each GET request
    async with aiohttp.ClientSession() as session:
        tasks = []
        for c in colors:
            tasks.append(get(session=session, color=c, **kwargs))
        # asyncio.gather() will wait on the entire task set to be
        # completed. If you want to process results greedily as they come in,
        # loop over asyncio.as_completed()
        results = await asyncio.gather(*tasks, return_exceptions=True)
        return results

if __name__ == '__main__':
    colors = ['red', 'blue', 'green']  # ...
    # Either take colors from stdin or make some default here
    asyncio.run(main(colors))  # Python 3.7+
There are two distinct elements to this, one being the asynchronous aspect of the coroutines and one being the concurrency introduced on top of that when you specify a container of tasks (futures):

1. First there is get(), which uses await with two awaitables: the first being .request and the second being .json. This is the async aspect. The purpose of awaiting these IO-bound responses is to tell the event loop that other get() calls can take turns running through that same routine.

2. Then there is await asyncio.gather(*tasks). This maps the awaitable get() call to each of your colors. The result is an aggregate list of returned values. Note that this wrapper will wait until all of your responses come in and call .json(). If, alternatively, you want to process them greedily as they are ready, you can loop over asyncio.as_completed(): each Future object returned represents the earliest result from the set of the remaining awaitables.
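For example, here is a minimal sketch of that as_completed() variant, reusing the get() coroutine from the snippet above (the name main_as_completed is just for illustration):

async def main_as_completed(colors, **kwargs):
    async with aiohttp.ClientSession() as session:
        tasks = [get(session=session, color=c, **kwargs) for c in colors]
        results = []
        # as_completed() yields awaitables in the order they finish,
        # so each response can be handled as soon as it arrives
        for coro in asyncio.as_completed(tasks):
            results.append(await coro)
        return results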
Lastly, take note that asyncio.run() is a high-level "porcelain" function introduced in Python 3.7. In earlier versions, you can mimic it (roughly) like:
# The "full" versions makes a new event loop and calls
# loop.shutdown_asyncgens(), see link above
loop = asyncio.get_event_loop()
try:
loop.run_until_complete(main(colors))
finally:
loop.close()
There are a number of ways to limit the rate of concurrency. For instance, see the questions "asyncio.Semaphore in async-await function" and "Large numbers of tasks with limited concurrency".
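As a rough sketch of the semaphore approach, building on the code above (the limit of 10 and the helper name bounded_get are arbitrary choices for illustration):

async def bounded_get(semaphore, session, color, **kwargs):
    # Allow at most N get() calls to run at once; the rest
    # wait here until the semaphore frees a slot
    async with semaphore:
        return await get(session=session, color=color, **kwargs)

async def main(colors, **kwargs):
    semaphore = asyncio.Semaphore(10)  # at most 10 requests in flight
    async with aiohttp.ClientSession() as session:
        tasks = [bounded_get(semaphore, session, c, **kwargs) for c in colors]
        return await asyncio.gather(*tasks, return_exceptions=True)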