Is it possible to have multiple loops with asyncio? If the answer is yes, how can I do that? My use case is:

* I extract urls from a list of websites asynchronously
* For each "sub url list", I crawl them asynchronously / concurrently
Example to extract urls:
    import asyncio
    import aiohttp
    from suburls import extractsuburls

    @asyncio.coroutine
    def extracturls(url):
        subtasks = []
        response = yield from aiohttp.request('GET', url)
        suburl_list = yield from response.text()
        for suburl in suburl_list:
            subtasks.append(asyncio.Task(extractsuburls(suburl)))
        loop = asyncio.get_event_loop()
        loop.run_until_complete(asyncio.gather(*subtasks))

    if __name__ == '__main__':
        subtasks = []
        urls_list = ['http://example1.com', 'http://example2.com']
        for url in urls_list:
            subtasks.append(asyncio.Task(extracturls(url)))
        loop = asyncio.get_event_loop()
        loop.run_until_complete(asyncio.gather(*subtasks))
        loop.close()
If I execute this code, I get an error when Python tries to launch the second loop, which says that a loop is already running.
P.S.: my module "extractsuburls" uses aiohttp to perform web requests.
EDIT:
Well, I've tried this solution:
    import asyncio
    import aiohttp
    from suburls import extractsuburls

    @asyncio.coroutine
    def extracturls(url):
        subtasks = []
        response = yield from aiohttp.request('GET', url)
        suburl_list = yield from response.text()
        jobs_loop = asyncio.new_event_loop()
        for suburl in suburl_list:
            subtasks.append(asyncio.Task(extractsuburls(suburl)))
        asyncio.set_event_loop(jobs_loop)
        jobs_loop.run_until_complete(asyncio.gather(*subtasks))
        jobs_loop.close()

    if __name__ == '__main__':
        subtasks = []
        urls_list = ['http://example1.com', 'http://example2.com']
        for url in urls_list:
            subtasks.append(asyncio.Task(extracturls(url)))
        loop = asyncio.get_event_loop()
        loop.run_until_complete(asyncio.gather(*subtasks))
        loop.close()
But I get this error: loop argument must agree with Future
Any idea?
When we use asyncio, we create objects called coroutines. A coroutine can be thought of as a lightweight thread. Much like we can have multiple threads running at the same time, each with its own concurrent I/O operation, we can have many coroutines running alongside one another.
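For instance, here is a minimal sketch (my own example, not from the question) of two coroutines waiting on I/O side by side, written with the newer async/await syntax:

    import asyncio

    async def fetch(name, delay):
        # asyncio.sleep stands in for a network call
        await asyncio.sleep(delay)
        return name

    async def main():
        # Both coroutines run alongside each other, so the total time is
        # roughly max(1, 2) seconds rather than 1 + 2 seconds.
        results = await asyncio.gather(fetch("a", 1), fetch("b", 2))
        print(results)   # ['a', 'b']

    asyncio.run(main())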
asyncio.run() should be used as the main entry point for asyncio programs, and should ideally only be called once. It is new in version 3.7.
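A minimal usage sketch of asyncio.run(), assuming Python 3.7 or later:

    import asyncio

    async def main():
        await asyncio.sleep(0.1)
        print("done")

    # asyncio.run() creates a fresh event loop, runs the coroutine to
    # completion, and closes the loop; call it once from synchronous code.
    asyncio.run(main())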
asyncio has an API for interoperating with Python's multiprocessing library. This lets us use async/await syntax as well as asyncio APIs with multiple processes.
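For example, a small sketch (my own illustration; the function names are made up) that offloads a CPU-bound call to a process pool through loop.run_in_executor:

    import asyncio
    from concurrent.futures import ProcessPoolExecutor

    def cpu_bound(n):
        # Runs in a separate worker process, so it does not block the event loop.
        return sum(i * i for i in range(n))

    async def main():
        loop = asyncio.get_running_loop()
        with ProcessPoolExecutor() as pool:
            # run_in_executor returns an awaitable that completes when the
            # worker process finishes.
            result = await loop.run_in_executor(pool, cpu_bound, 1_000_000)
            print(result)

    if __name__ == '__main__':
        asyncio.run(main())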
One of the cool advantages of asyncio is that it scales far better than threading. Each task takes far fewer resources and less time to create than a thread, so creating and running more of them works well.
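As a rough sketch of that claim (my own example, not from the thread), spawning thousands of concurrent tasks is routine:

    import asyncio

    async def worker(i):
        await asyncio.sleep(1)   # each task spends its time waiting, not computing
        return i

    async def main():
        # 10,000 concurrent tasks finish in about one second; the same number
        # of OS threads would cost far more memory and startup time.
        results = await asyncio.gather(*(worker(i) for i in range(10_000)))
        print(len(results))

    asyncio.run(main())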
You don't need several event loops, just use yield from asyncio.gather(*subtasks) inside the extracturls() coroutine:
    import asyncio
    import aiohttp
    from suburls import extractsuburls

    @asyncio.coroutine
    def extracturls(url):
        subtasks = []
        response = yield from aiohttp.request('GET', url)
        suburl_list = yield from response.text()
        for suburl in suburl_list:
            subtasks.append(extractsuburls(suburl))
        yield from asyncio.gather(*subtasks)

    if __name__ == '__main__':
        subtasks = []
        urls_list = ['http://example1.com', 'http://example2.com']
        for url in urls_list:
            subtasks.append(extracturls(url))
        loop = asyncio.get_event_loop()
        loop.run_until_complete(asyncio.gather(*subtasks))
        loop.close()
As a result, extracturls waits for its subtasks to finish before it returns.
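For reference, a sketch of the same structure in modern async/await syntax with a shared aiohttp.ClientSession (assuming Python 3.7+ and a current aiohttp; extract_suburls below is only a placeholder for the asker's own "suburls" module, which the question does not show):

    import asyncio
    import aiohttp

    def extract_suburls(html):
        # Placeholder for the asker's own parser from the "suburls" module.
        return []

    async def extracturls(session, url):
        async with session.get(url) as response:
            body = await response.text()
        # One gather per level: the coroutine simply awaits its children,
        # all inside the single event loop started by asyncio.run().
        await asyncio.gather(*(extracturls(session, u) for u in extract_suburls(body)))

    async def main(urls):
        async with aiohttp.ClientSession() as session:
            await asyncio.gather(*(extracturls(session, u) for u in urls))

    if __name__ == '__main__':
        asyncio.run(main(['http://example1.com', 'http://example2.com']))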