Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Waiting for a table to load completely using selenium with python

I want to scrape some data from a page which is in a table. So I am only bothered about the data in the table. Earlier I was using Mechanize, but I found sometimes some of the data are missing, especially in the bottom of the table. Googling, I found out that it may be due to mechanize not handling Jquery/Ajax.

So I switched to Selenium today. How do I wait for one and only one table to load completely and then extract all links from that table using selenium and python? If I wait for complete page to load, it is taking some time. I want to ensure that only data in the table is loaded. My current code:

driver = webdriver.Firefox()
for page in range(1, 2):
    driver.get("http://somesite.com/page/"+str(page))
    table = driver.find_element_by_css_selector('div.datatable')
    links = table.find_elements_by_tag_name('a')
    for link in links:
        print link.text
like image 931
user3215014 Avatar asked Mar 18 '23 19:03

user3215014


1 Answers

Use WebDriverWait to wait until the table is located:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

...
wait = WebDriverWait(driver, 10)
table = wait.until(EC.presence_of_element_located(By.CSS_SELECTOR, 'div.datatable'))

This would be an explicit wait.


Alternatively, you can make the driver wait implicitly:

An implicit wait is to tell WebDriver to poll the DOM for a certain amount of time when trying to find an element or elements if they are not immediately available. The default setting is 0. Once set, the implicit wait is set for the life of the WebDriver object instance.

from selenium import webdriver

driver = webdriver.Firefox()
driver.implicitly_wait(10) # wait up to 10 seconds while trying to locate elements
for page in range(1, 2):
    driver.get("http://somesite.com/page/"+str(page))
    table = driver.find_element_by_css_selector('div.datatable')
    links = table.find_elements_by_tag_name('a')
    for link in links:
        print link.text
like image 198
alecxe Avatar answered Apr 06 '23 14:04

alecxe