I would like to use asyncio to get webpage html.
I run the following code in jupyter notebook:
import asyncio
import aiofiles
import aiohttp
from aiohttp import ClientSession

async def get_info(url, session):
    resp = await session.request(method="GET", url=url)
    resp.raise_for_status()
    html = await resp.text(encoding='GB18030')
    with open('test_asyncio.html', 'w', encoding='utf-8-sig') as f:
        f.write(html)
    return html

async def main(urls):
    async with ClientSession() as session:
        tasks = [get_info(url, session) for url in urls]
        return await asyncio.gather(*tasks)

if __name__ == "__main__":
    url = ['http://huanyuntianxiazh.fang.com/house/1010123799/housedetail.htm', 'http://zhaoshangyonghefu010.fang.com/house/1010126863/housedetail.htm']
    result = asyncio.run(main(url))
However, it raises RuntimeError: asyncio.run() cannot be called from a running event loop
What is the problem?
How can I solve it?
The event loop is the core of every asyncio application. Event loops run asynchronous tasks and callbacks, perform network IO operations, and run subprocesses. Application developers should typically use the high-level asyncio functions, such as asyncio.run().
asyncio.run(), introduced in Python 3.7, is responsible for creating a new event loop, running the passed coroutine until it completes, and then closing the event loop.
Run an asyncio event loop:
run_until_complete(<some Future object>) – runs a given Future object, usually a coroutine defined by the async / await pattern, until it's complete.
run_forever() – runs the loop forever.
stop() – stops a running loop.
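The three loop methods listed above can be tried out in a small script. This is only a sketch: it assumes a plain Python script (not a notebook, where a loop is already running), and the `add` coroutine is a made-up stand-in for real work.

```python
import asyncio


async def add(a, b):
    # Stand-in coroutine (hypothetical); sleeps briefly to simulate I/O.
    await asyncio.sleep(0.01)
    return a + b


# Create a loop explicitly instead of letting asyncio.run() manage one.
loop = asyncio.new_event_loop()
try:
    # run_until_complete() drives the loop until the coroutine finishes
    # and returns the coroutine's result.
    total = loop.run_until_complete(add(1, 2))

    # run_forever() blocks until stop() is called, so schedule stop()
    # first; otherwise the script would never get past this point.
    loop.call_later(0.01, loop.stop)
    loop.run_forever()
    still_running = loop.is_running()
finally:
    loop.close()

print(total)
print(still_running, loop.is_closed())
```

In most programs asyncio.run() does this bookkeeping for you; managing the loop by hand is mainly useful for seeing what happens underneath.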
asyncio.run() should be used as a main entry point for asyncio programs, and should ideally only be called once. New in version 3.7.
The asyncio.run() documentation says:
This function cannot be called when another asyncio event loop is running in the same thread.
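The restriction can be reproduced outside a notebook as well. The sketch below (with hypothetical `inner` and `outer` coroutines) calls asyncio.run() from code that is itself running inside an event loop, catches the resulting error, and then shows that a plain `await` works in the same place.

```python
import asyncio


async def inner():
    return "done"


async def outer():
    # A loop is already running here, so a nested asyncio.run() is refused.
    coro = inner()
    try:
        asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        message = str(exc)
    else:
        message = None
    # Awaiting works, because it reuses the loop that is already running.
    value = await inner()
    return message, value


message, value = asyncio.run(outer())
print(message)
print(value)
```

A notebook cell is in the same position as the body of `outer()` here: the loop is already running, so `await` is the tool to reach for.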
In your case, Jupyter (IPython ≥ 7.0) is already running an event loop:
You can now use async/await at the top level in the IPython terminal and in the notebook, it should — in most of the cases — “just work”. Update IPython to version 7+, IPykernel to version 5+, and you’re off to the races.
Therefore you don't need to start the event loop yourself and can instead call await main(url) directly, even if your code lies outside any asynchronous function.
Jupyter / IPython
async def main():
    print(1)

await main()
Python (≥ 3.7) or older versions of IPython
import asyncio

async def main():
    print(1)

asyncio.run(main())
In your code that would give:
url = ['url1', 'url2']
result = await main(url)

for text in result:
    pass  # text contains your html (text) response
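To see what `result` holds without touching the network, here is a sketch in which a made-up `fake_get_info` replaces the real `get_info` and just waits for a given delay. It uses asyncio.run() so that it can be run as a plain script; in a notebook the last call would be `await main(urls)` instead.

```python
import asyncio


async def fake_get_info(url, delay):
    # Hypothetical stand-in for get_info(): no network access, just a delay.
    await asyncio.sleep(delay)
    return f"<html>{url}</html>"


async def main(urls):
    # The first URL is given the longest delay, so it finishes last.
    delays = [0.02 * (len(urls) - i) for i in range(len(urls))]
    tasks = [fake_get_info(u, d) for u, d in zip(urls, delays)]
    # gather() returns results in the order of its arguments,
    # not in completion order.
    return await asyncio.gather(*tasks)


urls = ["url1", "url2"]
# In a notebook: result = await main(urls)
result = asyncio.run(main(urls))
for url, text in zip(urls, result):
    print(url, len(text))
```

Since the results come back in the same order as the input list, `zip(urls, result)` pairs each URL with its own response.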
Caution
There is a slight difference in how Jupyter uses the loop compared to IPython.
To add to cglacet's answer: if you want to detect whether a loop is running and adjust automatically (i.e. run main() on the existing loop, otherwise call asyncio.run()), here is a snippet that may prove useful:
import asyncio

# async def main():
#     ...

try:
    loop = asyncio.get_running_loop()
except RuntimeError:  # 'RuntimeError: no running event loop'
    loop = None

if loop and loop.is_running():
    print('Async event loop already running. Adding coroutine to the event loop.')
    tsk = loop.create_task(main())
    # ^-- https://docs.python.org/3/library/asyncio-task.html#task-object
    # Optionally, a callback function can be executed when the coroutine completes
    tsk.add_done_callback(
        lambda t: print(f'Task done with result={t.result()} << return val of main()'))
else:
    print('Starting new event loop')
    result = asyncio.run(main())
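The same check can be wrapped in a small helper. This is just a sketch: `run_or_schedule` is a made-up name, and `main()` below is a placeholder for your own coroutine. Note that the helper returns two different things: the result itself when it had to start a loop, and an asyncio.Task when a loop was already running.

```python
import asyncio


async def main():
    # Placeholder coroutine (hypothetical); stands in for the real main().
    await asyncio.sleep(0.01)
    return 42


def run_or_schedule(coro):
    """Run coro to completion, or schedule it if a loop is already running.

    Returns the result in the first case and an asyncio.Task in the second.
    """
    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:  # no running event loop in this thread
        return asyncio.run(coro)
    return loop.create_task(coro)


# Plain script: no loop is running, so the result comes back directly.
direct = run_or_schedule(main())


async def caller():
    # Inside a running loop (as in Jupyter) a Task comes back instead;
    # awaiting it yields the result.
    task = run_or_schedule(main())
    return isinstance(task, asyncio.Task), await task


was_task, awaited = asyncio.run(caller())
print(direct, was_task, awaited)
```

In a notebook cell you would write `result = await run_or_schedule(main())`, since there the helper hands back a Task.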
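Alternatively, use nest_asyncio, which patches asyncio to allow nested use of asyncio.run() and loop.run_until_complete():
https://github.com/erdewit/nest_asyncio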
import nest_asyncio
nest_asyncio.apply()