
How to get text of an element in Selenium WebDriver, without including child element text?

<div id="a">This is some
   <div id="b">text</div>
</div>

Getting "This is some" is non-trivial. For instance, this returns "This is some text":

driver.find_element_by_id('a').text 

How does one, in a general way, get the text of a specific element without including the text of its children?

(I'm providing an answer below but will leave the question open in case someone can come up with a less hideous solution).

asked Sep 07 '12 21:09 by josh




2 Answers

Here's a general solution:

def get_text_excluding_children(driver, element):
    return driver.execute_script("""
    return jQuery(arguments[0]).contents().filter(function() {
        return this.nodeType == Node.TEXT_NODE;
    }).text();
    """, element)

The element passed to the function can be something obtained from the find_element...() methods (i.e. it can be a WebElement object).

Or, if you don't have jQuery or don't want to use it, you can replace the body of the function above with this:

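# Uses the function's own driver argument (not self.driver), so it drops
# straight into get_text_excluding_children(driver, element).
return driver.execute_script("""
var parent = arguments[0];
var child = parent.firstChild;
var ret = "";
while (child) {
    // Keep only direct text nodes; element children are skipped.
    if (child.nodeType === Node.TEXT_NODE)
        ret += child.textContent;
    child = child.nextSibling;
}
return ret;
""", element)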

I'm actually using this code in a test suite.
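
The loop in the plain-JavaScript version can be tried without a browser. What follows is only a minimal sketch: it assumes the markup from the question and uses Python's standard xml.dom.minidom as a stand-in for the browser DOM. The name direct_text is a hypothetical helper, not a Selenium API.

```python
from xml.dom import minidom


def direct_text(element):
    """Concatenate the data of an element's direct text-node children only."""
    parts = []
    child = element.firstChild
    while child is not None:
        # Same test as the JavaScript loop: keep text nodes, skip elements.
        if child.nodeType == child.TEXT_NODE:
            parts.append(child.data)
        child = child.nextSibling
    return "".join(parts)


# Markup from the question (whitespace aside).
doc = minidom.parseString('<div id="a">This is some    <div id="b">text</div> </div>')
outer = doc.documentElement

print(repr(direct_text(outer).strip()))  # 'This is some'
```

As in the browser version, the raw result keeps the surrounding whitespace of every direct text node, so callers will usually want to strip() it.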

answered Sep 19 '22 08:09 by Louis


In the HTML which you have shared:

<div id="a">This is some
   <div id="b">text</div>
</div>

The text This is some is within a text node. To depict the text node in a structured way:

<div id="a">
    This is some
   <div id="b">text</div>
</div>
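
To see where that text node sits among the children of <div id="a">, here is a small browser-free sketch. It parses the same markup with Python's standard xml.dom.minidom, which only approximates the browser DOM and is not part of Selenium. The helper name describe is made up for illustration.

```python
from xml.dom import minidom

# Markup from the question (whitespace aside).
doc = minidom.parseString('<div id="a">This is some    <div id="b">text</div> </div>')
parent = doc.documentElement


def describe(node):
    """Return (kind, content) for a text or element node."""
    if node.nodeType == node.TEXT_NODE:
        return ("text", node.data)
    return ("element", node.tagName)


children = [describe(node) for node in parent.childNodes]
for index, item in enumerate(children):
    print(index, item)
```

The listing shows a text node at index 0, the inner div at index 1, and a whitespace-only text node at index 2, which is why the wanted text is reachable as the first child.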

This use case

To extract and print the text This is some from the text node with Selenium's Python client, there are two approaches:

  • Using splitlines(): You can identify the parent element, i.e. <div id="a">, extract its innerHTML and take the first line with splitlines(). This relies on the page source having a line break right after the text; if the markup is on a single line, splitlines()[0] returns the entire innerHTML:

    • using xpath:

      print(driver.find_element_by_xpath("//div[@id='a']").get_attribute("innerHTML").splitlines()[0])

    • using css_selector:

      print(driver.find_element_by_css_selector("div#a").get_attribute("innerHTML").splitlines()[0])

  • Using execute_script(): You can also use the execute_script() method, which synchronously executes JavaScript in the current window/frame:

    • using xpath and firstChild:

      parent_element = driver.find_element_by_xpath("//div[@id='a']")
      print(driver.execute_script('return arguments[0].firstChild.textContent;', parent_element).strip())

    • using xpath and childNodes[n] (here n is 0, because the text node is the first child; childNodes[1] is <div id="b">):

      parent_element = driver.find_element_by_xpath("//div[@id='a']")
      print(driver.execute_script('return arguments[0].childNodes[0].textContent;', parent_element).strip())
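
The splitlines() approach depends on how the page source is laid out, which can be checked with plain strings. In this sketch, both innerHTML values are hypothetical stand-ins for what get_attribute("innerHTML") might return; they are not captured from a real browser.

```python
# Hypothetical innerHTML values for <div id="a">; only the whitespace differs.
with_break = 'This is some\n   <div id="b">text</div>\n'
one_line = 'This is some    <div id="b">text</div> '

first_with_break = with_break.splitlines()[0]
first_one_line = one_line.splitlines()[0]

print(repr(first_with_break))      # 'This is some'
print(first_one_line == one_line)  # True: the child markup is still included
```

When the layout of the source is not under your control, the execute_script() variants are the safer choice, since they read the text node itself rather than guessing from line breaks.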
answered Sep 22 '22 08:09 by undetected Selenium