Please note: A more refined version of this question, with an appropriate answer can be found here.
I would like to use the Selenium Python bindings to find elements with a given text on a web page. For example, suppose I have the following HTML:
<html>
<head>...</head>
<body>
<someElement>This can be found</someElement>
<someOtherElement>This can <em>not</em> be found</someOtherElement>
</body>
</html>
I need to search by text and am able to find <someElement> using the following XPath:
//*[contains(text(), 'This can be found')]
I am looking for a similar XPath that lets me find <someOtherElement> using the plain text "This can not be found". The following does not work:
//*[contains(text(), 'This can not be found')]
I understand that this is because of the nested em element, which "disrupts" the text flow of "This can not be found". Is it possible via XPath to ignore nestings like the one above?
You can use //*[contains(., 'This can not be found')].
The context node . will be converted to its string representation (the concatenation of all of its descendant text nodes) before comparison to 'This can not be found'.
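To make the difference concrete, here is a minimal sketch that mimics the two comparisons in plain Python using only the standard library's ElementTree (which does not itself support contains()). The helper names first_text_node and string_value are made up for this illustration, and using el.text as a stand-in for "the first text node" is an approximation that happens to hold for this document.

```python
import xml.etree.ElementTree as ET

HTML = """<html>
<head>...</head>
<body>
<someElement>This can be found</someElement>
<someOtherElement>This can <em>not</em> be found</someOtherElement>
</body>
</html>"""

NEEDLE = "This can not be found"
root = ET.fromstring(HTML)


def first_text_node(el):
    # Approximates what contains(text(), ...) compares against in XPath 1.0:
    # the node-set text() is converted to a string using its first node only.
    # (Assumption: el.text equals the first text node, true for this sample.)
    return el.text or ""


def string_value(el):
    # Approximates what contains(., ...) compares against:
    # every descendant text node, concatenated in document order.
    return "".join(el.itertext())


# Tags of elements matched by each strategy, in document order.
by_text = [el.tag for el in root.iter() if NEEDLE in first_text_node(el)]
by_dot = [el.tag for el in root.iter() if NEEDLE in string_value(el)]

print(by_text)  # nothing matches: the first text node is only "This can "
print(by_dot)   # the target element plus every element that encloses it
```

In Selenium itself, the XPath string would simply be passed as driver.find_elements(By.XPATH, "//*[contains(., 'This can not be found')]"), with By imported from selenium.webdriver.common.by.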
Be careful, though: because you are using //*, it will match ALL enclosing elements that contain this string.
In your example case, it will match:
<someOtherElement>
<body>
<html>
You could restrict this by targeting specific element tags or a specific section of your document (a <table> or <div> with a known id or class).
Edit for the OP's question in comment on how to find the most nested elements matching the text condition:
The accepted answer here suggests //*[count(ancestor::*) = max(//*/count(ancestor::*))] to select the most nested element. I think it works only in XPath 2.0.
When combined with your substring condition, I was able to test it here with this document
<html>
<head>...</head>
<body>
<someElement>This can be found</someElement>
<nested>
<someOtherElement>This can <em>not</em> be found most nested</someOtherElement>
</nested>
<someOtherElement>This can <em>not</em> be found</someOtherElement>
</body>
</html>
and with this XPath 2.0 expression
//*[contains(., 'This can not be found')]
[count(ancestor::*) = max(//*/count(./*[contains(., 'This can not be found')]/ancestor::*))]
And it matches the element containing "This can not be found most nested".
There probably is a more elegant way to do that.
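Since browsers generally evaluate XPath 1.0 only, the XPath 2.0 expression above may not be usable directly through Selenium. As a rough sketch of the same idea, the following standard-library Python walks the test document and picks (a) the deepest matching element, as the expression above does, and (b) every "innermost" match, meaning a matching element with no matching child. The helper names contains_needle and walk are invented for this example, and (b) is an alternative reading of "most nested", not what the answer above computes.

```python
import xml.etree.ElementTree as ET

DOC = """<html>
<head>...</head>
<body>
<someElement>This can be found</someElement>
<nested>
<someOtherElement>This can <em>not</em> be found most nested</someOtherElement>
</nested>
<someOtherElement>This can <em>not</em> be found</someOtherElement>
</body>
</html>"""

NEEDLE = "This can not be found"
root = ET.fromstring(DOC)


def contains_needle(el):
    # Same test as contains(., NEEDLE): search the concatenated descendant text.
    return NEEDLE in "".join(el.itertext())


def walk(el, depth=0):
    # Yield (element, number of ancestors) pairs in document order.
    yield el, depth
    for child in el:
        yield from walk(child, depth + 1)


matches = [(el, depth) for el, depth in walk(root) if contains_needle(el)]

# (a) Deepest match: keep only matches with the maximum ancestor count.
max_depth = max(depth for _, depth in matches)
deepest = [el for el, depth in matches if depth == max_depth]

# (b) Innermost matches: matching elements with no matching child element.
innermost = [el for el, _ in matches
             if not any(contains_needle(child) for child in el)]

print(["".join(el.itertext()) for el in deepest])
print(["".join(el.itertext()) for el in innermost])
```

Variant (b) can also be written in XPath 1.0 as //*[contains(., 'This can not be found')][not(*[contains(., 'This can not be found')])], which on this document selects both <someOtherElement> elements rather than only the deepest one.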