Limit the number of concurrent requests with aiohttp

I'm downloading images using aiohttp, and was wondering if there is a way to limit the number of open requests that haven't finished. This is the code I currently have:

import asyncio
import aiohttp


async def get_images(url, session):

    chunk_size = 100

    # Print statement to show when a request is being made. 
    print(f'Making request to {url}')

    async with session.get(url=url) as r:
        with open('path/name.png', 'wb') as file:
            while True:
                chunk = await r.content.read(chunk_size)
                if not chunk:
                    break
                file.write(chunk)

# List of urls to get images from
urls = [...]

conn = aiohttp.TCPConnector(limit=3)
loop = asyncio.get_event_loop()
session = aiohttp.ClientSession(connector=conn, loop=loop)
loop.run_until_complete(asyncio.gather(*(get_images(url, session=session) for url in urls)))

The problem is that the print statement I added to show when each request is made reports almost 21 requests at once, instead of the 3 I want to limit it to (i.e., once an image is done downloading, it should move on to the next URL in the list). What am I doing wrong here?

Jasonca1 asked May 05 '18 14:05


1 Answer

asyncio.Semaphore solves exactly this issue.

In your case it'll be something like this:

semaphore = asyncio.Semaphore(3)


async def get_images(url, session):

    async with semaphore:

        print(f'Making request to {url}')

        # ...

You may also be interested in this ready-to-run code example that demonstrates how a semaphore works.
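For a quick check without any network access, here is a minimal sketch of the same idea. It is not the code from the question: fake_download, stats, LIMIT and the example.com URLs are hypothetical names made up for illustration, and asyncio.sleep() stands in for the real session.get() call. It records the highest number of coroutines that were inside the semaphore at the same time.

```python
import asyncio

LIMIT = 3  # same limit the question is aiming for


async def fake_download(url, semaphore, stats):
    # Hypothetical stand-in for get_images(); no real HTTP request is made.
    async with semaphore:
        # Only LIMIT coroutines can be past this point at any moment.
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        print(f"Making request to {url}")
        await asyncio.sleep(0.05)  # simulates the time spent downloading
        stats["active"] -= 1


async def main(urls):
    # Create the semaphore inside the running event loop.
    semaphore = asyncio.Semaphore(LIMIT)
    stats = {"active": 0, "peak": 0}
    await asyncio.gather(*(fake_download(url, semaphore, stats) for url in urls))
    return stats["peak"]


if __name__ == "__main__":
    # Placeholder URLs, assumed for the demo only.
    urls = [f"http://example.com/{i}.png" for i in range(21)]
    peak = asyncio.run(main(urls))
    print(f"Peak concurrent downloads: {peak}")
```

Note that the print call sits inside the async with block here. In the question's code it runs before session.get() is awaited, so it fires for every URL as soon as gather() schedules the coroutines, regardless of how many connections the connector actually allows at once.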

Mikhail Gerasimov answered Sep 17 '22 10:09