How to get text of an element in Selenium WebDriver, without including child element text?

Tags: python, html, selenium, selenium-webdriver

<div id="a">This is some
   <div id="b">text</div>
</div>

Getting "This is some" is non-trivial. For instance, this returns "This is some text":

driver.find_element_by_id('a').text

How does one, in a general way, get the text of a specific element without including the text of its children?

(I'm providing an answer below but will leave the question open in case someone can come up with a less hideous solution).

asked Sep 07 '12 21:09

josh

2 Answers

Here's a general solution:

def get_text_excluding_children(driver, element):
    return driver.execute_script("""
    return jQuery(arguments[0]).contents().filter(function() {
        return this.nodeType == Node.TEXT_NODE;
    }).text();
    """, element)

The element passed to the function can be something obtained from the find_element...() methods (i.e. it can be a WebElement object).

Or, if you don't have jQuery or don't want to use it, you can replace the body of the function above with this:

return driver.execute_script("""
var parent = arguments[0];
var child = parent.firstChild;
var ret = "";
// Walk the direct children and keep only the text nodes.
while(child) {
    if (child.nodeType === Node.TEXT_NODE)
        ret += child.textContent;
    child = child.nextSibling;
}
return ret;
""", element)

I'm actually using this code in a test suite.
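For reference, here is how the jQuery-free variant might look when assembled into one function. This is only a sketch: the `OWN_TEXT_SCRIPT` constant and the `strip` parameter are illustrative additions, not part of the code above, and `driver` is assumed to be any object with a WebDriver-style `execute_script()` method.

```python
# JavaScript that concatenates only the direct text-node children of the
# element passed as arguments[0]; nested elements are skipped.
OWN_TEXT_SCRIPT = """
var parent = arguments[0];
var child = parent.firstChild;
var ret = "";
while (child) {
    if (child.nodeType === Node.TEXT_NODE)
        ret += child.textContent;
    child = child.nextSibling;
}
return ret;
"""


def get_text_excluding_children(driver, element, strip=True):
    """Return the text of `element` without the text of its child elements.

    `driver` is assumed to expose execute_script(script, *args).
    `strip` is an illustrative extra: when true, surrounding whitespace
    (including line breaks from the page source) is removed.
    """
    text = driver.execute_script(OWN_TEXT_SCRIPT, element)
    return text.strip() if strip else text
```

With Selenium 4 the element would typically be located with `driver.find_element(By.ID, "a")` (after `from selenium.webdriver.common.by import By`), since the `find_element_by_*` helpers used in this thread were deprecated in Selenium 4 and later removed. For the markup in the question, the call should then return "This is some".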

answered Sep 19 '22 08:09

Louis

In the HTML you shared:

<div id="a">This is some
   <div id="b">text</div>
</div>

The text "This is some" is within a text node. To depict the text node in a structured way:

<div id="a">
    This is some
    <div id="b">text</div>
</div>
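To make the node structure concrete, here is a small sketch that lists the direct children of the `<div id="a">` element using only Python's standard `html.parser` module (no browser involved). The `DirectChildLister` class and `direct_children()` helper are hypothetical names made up for this illustration, and the parser assumes well-formed markup without void elements such as `<br>`.

```python
from html.parser import HTMLParser


class DirectChildLister(HTMLParser):
    """Record the direct children (text nodes and elements) of one element."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0       # 0 = outside the target, 1 = directly inside it
        self.children = []   # list of (kind, value) tuples

    def handle_starttag(self, tag, attrs):
        if self.depth == 0:
            # Start tracking once the element with the wanted id opens.
            if dict(attrs).get("id") == self.target_id:
                self.depth = 1
            return
        if self.depth == 1:
            self.children.append(("element", tag))
        self.depth += 1

    def handle_endtag(self, tag):
        if self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Only text sitting directly inside the target is a direct text node.
        if self.depth == 1:
            self.children.append(("text", data))


def direct_children(html, target_id):
    parser = DirectChildLister(target_id)
    parser.feed(html)
    parser.close()
    return parser.children


sample = '<div id="a">This is some\n   <div id="b">text</div>\n</div>'
children = direct_children(sample, "a")
own_text = "".join(value for kind, value in children if kind == "text").strip()
print(children)
print(own_text)
```

The list comes out as a text node, then the `div` element, then a whitespace-only text node, which corresponds to `childNodes[0]`, `childNodes[1]` and `childNodes[2]` in the browser's DOM.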

This use case

To extract and print the text "This is some" from the text node using Selenium's Python client, there are two approaches:

Using splitlines(): You can identify the parent element, i.e. <div id="a">, extract its innerHTML, and then use splitlines(). Note that this relies on the text being followed by a line break in the page source:

using xpath:

print(driver.find_element_by_xpath("//div[@id='a']").get_attribute("innerHTML").splitlines()[0])

using css_selector:

print(driver.find_element_by_css_selector("div#a").get_attribute("innerHTML").splitlines()[0])

Using execute_script(): You can also use the execute_script() method, which synchronously executes JavaScript in the current window/frame:

using xpath and firstChild:

parent_element = driver.find_element_by_xpath("//div[@id='a']")
print(driver.execute_script('return arguments[0].firstChild.textContent;', parent_element).strip())

using xpath and childNodes[n]:

parent_element = driver.find_element_by_xpath("//div[@id='a']")
# childNodes[0] is the leading text node; childNodes[1] would be <div id="b">
print(driver.execute_script('return arguments[0].childNodes[0].textContent;', parent_element).strip())
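A caveat on the splitlines() approach: it depends on how the markup happens to be laid out. The sketch below mimics it on plain strings, so no browser is needed. Both innerHTML values are assumptions about what the page might return, and `first_line_of_inner_html()` is a hypothetical helper written for this illustration.

```python
def first_line_of_inner_html(inner_html):
    """Mimic get_attribute("innerHTML").splitlines()[0] on a plain string."""
    lines = inner_html.splitlines()
    return lines[0].strip() if lines else ""


# Assumed innerHTML of <div id="a"> when the source has a line break after the text.
multi_line = 'This is some\n   <div id="b">text</div>\n'
# The same markup, hypothetically minified onto a single line.
single_line = 'This is some <div id="b">text</div> '

print(first_line_of_inner_html(multi_line))   # This is some
print(first_line_of_inner_html(single_line))  # This is some <div id="b">text</div>
```

When the markup is on a single line, the child element's markup leaks into the result, so the execute_script() variants are likely the more robust choice.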

answered Sep 22 '22 08:09

undetected Selenium
