Selenium: Wait until text in WebElement changes

I'm using Selenium with Python 2.7 to retrieve the contents of a search box on a webpage. The search box dynamically retrieves the results and displays them in the box itself.

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import pandas as pd
import re
from time import sleep

driver = webdriver.Firefox()
driver.get(url)

df = pd.read_csv("read.csv")

def crawl(isin):
    searchkey = driver.find_element_by_name("searchkey")
    searchkey.clear()
    searchkey.send_keys(isin)
    sleep(11)

    search_result = driver.find_element_by_class_name("ac_results")
    names = re.match(r"^.*(?=(\())", search_result.text).group().encode("utf-8")
    product_id = re.findall(r"((?<=\()[0-9]*)", search_result.text)
    return pd.Series([product_id, names])

df[["insref", "name"]] = df["ISIN"].apply(crawl)

print df

The relevant part of the code is in def crawl(isin):

  • The program enters the search term in the search box (the box is badly named searchkey).
  • It then calls sleep() and waits for the content to show up in the search box's dropdown field, ac_results.
  • Finally, it extracts two values with regex: the product IDs (insrefs) and the names.
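
To make the regex step concrete, here is a small sketch that runs the same two patterns against made-up dropdown text. The sample string is an assumption for illustration only; the real content of ac_results may be laid out differently.

```python
import re

# Hypothetical dropdown text; the real page's format may differ.
sample_text = "Example Fund (12345)\nAnother Fund (678)"

# Everything on the first line before its last "(" (the greedy .* backtracks
# until the lookahead finds one). re.match returns None if there is no "(".
name_match = re.match(r"^.*(?=(\())", sample_text)
name = name_match.group() if name_match else None

# The run of digits that directly follows every "(" in the whole text.
ids = re.findall(r"((?<=\()[0-9]*)", sample_text)
```

Two things stand out in this sketch: the name keeps its trailing space, and re.match() returns None when the text has no "(", so calling .group() on it directly would raise an AttributeError.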

Instead of calling sleep(), I would like for it to wait for the content in the WebElement ac_results to load.

Since it will continuously use the search box to get new data by entering new search terms from a list, one could perhaps use Regex to identify when there is new content in ac_results that is not identical to the previous content.

Is there a method for this? It is important to note that the content in the search box is dynamically loaded, so the function would have to recognise that something has changed in the WebElement.

P A N asked Jun 21 '15 13:06


2 Answers

You need to use an Explicit Wait. For example, to wait for an element to become visible:

wait = WebDriverWait(driver, 10)
wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'searchbox')))

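Here, it waits up to 10 seconds, checking the visibility of the element every 500 ms (the default poll interval), and raises a TimeoutException if the element never becomes visible. WebDriverWait is imported from selenium.webdriver.support.ui, EC is the selenium.webdriver.support.expected_conditions module, and By comes from selenium.webdriver.common.by.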
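
The idea behind an explicit wait can be sketched without Selenium as a plain polling loop. This is only an illustration of the concept, not Selenium's implementation: wait_until is a hypothetical helper name, and the "condition" here is a timer standing in for "the results have loaded".

```python
import time


def wait_until(condition, timeout=10.0, poll_frequency=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises RuntimeError once `timeout` seconds pass.
    """
    deadline = time.time() + timeout
    while True:
        value = condition()
        if value:
            return value
        if time.time() >= deadline:
            raise RuntimeError("condition not met within %.1f s" % timeout)
        time.sleep(poll_frequency)


# Hypothetical stand-in for "the results have loaded": true 0.2 s from now.
ready_at = time.time() + 0.2
result = wait_until(lambda: time.time() >= ready_at,
                    timeout=2.0, poll_frequency=0.05)
```

Unlike a fixed sleep(11), the loop returns as soon as the condition holds and only spends the full timeout when it never does.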

Selenium ships with a set of built-in Expected Conditions to wait for, and it is also easy to write your own custom Expected Condition.


FYI, here is how we approached it after brainstorming it in the chat. We've introduced a custom Expected Condition that would wait for the element text to change. It helped us to identify when the new search results appear:

import re

import pandas as pd
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait

class text_to_change(object):
    """Expected Condition: true once the element's text differs from `text`."""

    def __init__(self, locator, text):
        self.locator = locator
        self.text = text

    def __call__(self, driver):
        # Look the element up on every poll (public API instead of the
        # private _find_element helper) so freshly loaded content is seen.
        actual_text = driver.find_element(*self.locator).text
        return actual_text != self.text

#Load URL
driver = webdriver.Firefox()
driver.get(url)

#Load DataFrame of terms to search for
df = pd.read_csv("searchkey.csv")

#Crawling function    
def crawl(searchkey):
    try: 
        text_before = driver.find_element_by_class_name("ac_results").text 
    except NoSuchElementException: 
        text_before = ""

    searchbox = driver.find_element_by_name("searchbox")
    searchbox.clear()
    searchbox.send_keys(searchkey)
    print "\nSearching for %s ..." % searchkey

    WebDriverWait(driver, 10).until(
        text_to_change((By.CLASS_NAME, "ac_results"), text_before)
    )

    search_result = driver.find_element_by_class_name("ac_results")
    if search_result.text != "none":
        names = re.match(r"^.*(?=(\())", search_result.text).group().encode("utf-8")
        insrefs = re.findall(r"((?<=\()[0-9]*)", search_result.text)
    else:
        # "none" contains no "(", so neither regex can match anything
        names = None
        insrefs = []
    return pd.Series([insrefs, names])

#Run crawl    
df[["Insref", "Name"]] = df["ISIN"].apply(crawl)

#Print DataFrame    
print df
alecxe answered Nov 16 '22 19:11


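I suggest using one of the Expected Conditions below in WebDriverWait. Both do a plain substring check (on the element's text and on its value attribute, respectively), not a regex match, so pass literal text, such as the "(" that precedes each ID in the results:

WebDriverWait(driver, 10).until(
    text_to_be_present_in_element((By.CLASS_NAME, "searchbox"), "(")
)

or

WebDriverWait(driver, 10).until(
    text_to_be_present_in_element_value((By.CLASS_NAME, "searchbox"), "(")
)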
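
The built-in text conditions compare by substring, so if a pattern match is what's needed, a custom condition in the style of the other answer is one option. Below is a minimal sketch: text_to_match is a hypothetical name, the pattern \(\d+\) (a number in parentheses) is an assumption about what the results look like, and FakeDriver/FakeElement are stubs so the condition can be exercised without a browser.

```python
import re


class text_to_match(object):
    """Expected-condition-style callable: true once the located element's
    text contains a match for the given regular expression."""

    def __init__(self, locator, pattern):
        self.locator = locator
        self.regex = re.compile(pattern)

    def __call__(self, driver):
        text = driver.find_element(*self.locator).text
        return self.regex.search(text) is not None


# Hypothetical stand-ins for a WebDriver and a WebElement.
class FakeElement(object):
    def __init__(self, text):
        self.text = text


class FakeDriver(object):
    def __init__(self, text):
        self.element = FakeElement(text)

    def find_element(self, by, value):
        return self.element


fake = FakeDriver("Searching...")
# "class name" is the string value behind By.CLASS_NAME.
has_id = text_to_match(("class name", "ac_results"), r"\(\d+\)")
before = has_id(fake)                        # no "(digits)" in the text yet
fake.element.text = "Example Fund (12345)"   # simulate results arriving
after = has_id(fake)
```

With a real driver, the same object would be passed as WebDriverWait(driver, 10).until(text_to_match((By.CLASS_NAME, "ac_results"), r"\(\d+\)")). Note that a pattern check alone cannot tell new results from the previous search's results, which is what the text-change approach addresses.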
Manu answered Oct 05 '22 14:10