
Why is aiohttp horribly slower than gevent?

Disclaimer: I am a total beginner in aiohttp

I was experimenting with aiohttp to handle GET requests asynchronously, but it turned out to be horribly slower than the pooled gevent version.

GEVENT VERSION

import gevent
from gevent import monkey
monkey.patch_all()
from gevent.pool import Pool

import requests
import time

def pooling_task(url):
    requests.get(url)


def pooling_main():
    start = time.time()
    pool = Pool(10)
    urls = [
        "http://google.com",
        "http://yahoo.com",
        "http://linkedin.com",
        "http://shutterfly.com",
        "http://mypublisher.com",
        "http://facebook.com"
    ]
    for url in urls:
        pool.apply_async(pooling_task, args=(url,))

    pool.join()
    end = time.time()
    print("POOL TIME {}".format(end-start))

if __name__ == '__main__':
    print("POOLING VERSION")
    pooling_main()

OUTPUT - POOL TIME 6.299163818359375

Following is the aiohttp version

import aiohttp
import asyncio
import time
import uvloop

asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())

async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()


async def main():
    urls = [
        "http://google.com",
        "http://yahoo.com",
        "http://linkedin.com",
        "http://shutterfly.com",
        "http://mypublisher.com",
        "http://facebook.com"]

    async with aiohttp.ClientSession() as session:
        for url in urls:
            await fetch(session, url)

if __name__ == "__main__":
    start = time.time()
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
    end = time.time()
    print("Time taken {}".format(end - start))

OUTPUT - Time taken 15.399710178375244

I really don't understand why aiohttp is so much slower. In the gevent version, requests.get is still a blocking call, but that is not the case for aiohttp.

I expected aiohttp version to be faster.

Asked Dec 05 '22 by Ishan Bhatt

1 Answer

for url in urls:
    await fetch(session, url)

await here means that you don't start downloading the next url until the previous one is done. To make all the downloads run concurrently you should use something like asyncio.gather.

Modify your code like this:

async with aiohttp.ClientSession() as session:
    await asyncio.gather(*[
        fetch(session, url)
        for url
        in urls
    ])

You'll see huge speedup.
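The effect is easy to reproduce without touching the network. In the sketch below, fake_fetch with a 0.5-second asyncio.sleep is a stand-in for session.get (the name and delay are illustrative, not from the original code); timing both approaches shows sequential awaits taking roughly the sum of all delays, while gather takes roughly the longest single delay:

```python
import asyncio
import time

async def fake_fetch(url):
    # Stand-in for an HTTP request: yields to the event loop for 0.5 s
    await asyncio.sleep(0.5)
    return url

async def sequential(urls):
    # Awaiting in a loop: each "request" waits for the previous one to finish
    return [await fake_fetch(u) for u in urls]

async def concurrent(urls):
    # asyncio.gather runs all the coroutines on the loop at the same time
    return await asyncio.gather(*[fake_fetch(u) for u in urls])

urls = ["url{}".format(i) for i in range(6)]

start = time.time()
asyncio.run(sequential(urls))
seq_time = time.time() - start   # roughly 6 * 0.5 = 3 seconds

start = time.time()
asyncio.run(concurrent(urls))
conc_time = time.time() - start  # roughly 0.5 seconds

print("sequential: {:.2f}s, concurrent: {:.2f}s".format(seq_time, conc_time))
```

The same ratio is what you see with real requests: the gathered version finishes in about the time of the slowest single response instead of the sum of all of them.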

Answered Dec 26 '22 by Mikhail Gerasimov