Is it possible to open a browser for each task first, and load the links after that? This code raises an error:
import asyncio
from selenium import webdriver

async def get_html(url):
    driver = await webdriver.Chrome()
    response = await driver.get(url)
TypeError: object WebDriver can't be used in 'await' expression
Selenium's WebDriver API is synchronous, so its objects cannot be awaited. If you want to use Selenium asynchronously, I would suggest running several driver instances through an executor, like this:
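import asyncio
from concurrent.futures import ThreadPoolExecutor
from selenium import webdriver

executor = ThreadPoolExecutor(10)

def scraper(url):
    # Blocking Selenium work; each call drives its own browser in a worker thread.
    # Selenium 3 style path argument; Selenium 4 expects a Service object instead.
    driver = webdriver.Chrome("./chromedriver")
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()

async def main(urls):
    loop = asyncio.get_running_loop()
    # run_in_executor returns futures, not tasks, so asyncio.all_tasks() would
    # miss them; collect the futures explicitly and await them together.
    futures = [loop.run_in_executor(executor, scraper, url) for url in urls]
    return await asyncio.gather(*futures)

pages = asyncio.run(main(["https://google.de"] * 2))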
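To address the original question more directly (open every browser first, then load the links), the same thread-offloading idea can be split into two phases. The following is only a sketch under some assumptions: it needs Python 3.9+ for `asyncio.to_thread`, and the names `open_then_load`, `chrome_factory`, and `make_driver` are hypothetical helpers introduced here, not part of Selenium.

```python
import asyncio


def chrome_factory():
    # Hypothetical default factory; the import is local so the sketch can be
    # exercised with a stand-in driver when Selenium is not installed.
    from selenium import webdriver
    return webdriver.Chrome()


async def open_then_load(urls, make_driver=chrome_factory):
    # Phase 1: start one browser per URL, each in its own worker thread.
    drivers = await asyncio.gather(
        *(asyncio.to_thread(make_driver) for _ in urls)
    )
    try:
        # Phase 2: once every browser is up, load the links concurrently.
        await asyncio.gather(
            *(asyncio.to_thread(d.get, u) for d, u in zip(drivers, urls))
        )
        # Reading page_source is also a blocking call, but a short one.
        return [d.page_source for d in drivers]
    finally:
        # Close every browser, even if loading a page failed.
        for d in drivers:
            d.quit()
```

It could be called as `pages = asyncio.run(open_then_load(["https://google.de"] * 2))`. Passing the factory as a parameter keeps the driver setup (path, options) in one place and makes the coroutine easy to try out with a fake driver.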
Please note that you can run Selenium in headless mode, so you don't need to spawn the whole GUI just to load a simple URL.
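As a rough illustration of headless mode, Chrome can be started with the `--headless` argument. The helper name `build_headless_chrome` is made up for this sketch, and it assumes chromedriver can be found by your Selenium installation.

```python
def build_headless_chrome():
    # Imports are local so this module can be imported without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")  # run Chrome without a visible window
    return webdriver.Chrome(options=options)
```

A function like this can stand in for the plain `webdriver.Chrome(...)` call inside the worker function.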
The problem was discussed at: https://github.com/SeleniumHQ/selenium/issues/3399
If you want to have async webdrivers, there are two libraries you can use: