XPath: Find HTML element by plain text

Tags:

html

xpath

Please note: This question is a more refined version of a previous question.

I am looking for an XPath that lets me find elements with a given plain text in an HTML document. For example, suppose I have the following HTML:

<html>
<head>...</head>
<body>
    <someElement>This can be found</someElement>
    <nested>
        <someOtherElement>This can <em>not</em> be found most nested</someOtherElement>
    </nested>
    <yetAnotherElement>This can <em>not</em> be found</yetAnotherElement>
</body>
</html>

I need to search by text and am able to find <someElement> using the following XPath:

//*[contains(text(), 'This can be found')]

I am looking for a similar XPath that lets me find <someOtherElement> and <yetAnotherElement> using the plain text "This can not be found". The following does not work:

//*[contains(text(), 'This can not be found')]

I understand that this is because of the nested em element that "disrupts" the text flow of "This can not be found". Is it possible via XPaths to, in a way, ignore such or similar nestings as the one above?
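To make the "disruption" concrete, here is a minimal sketch using Python's standard-library xml.etree.ElementTree (chosen only for illustration, not mentioned in the original question). It shows that the phrase is split across two separate text nodes around the em element. In XPath 1.0, contains(text(), ...) converts the text() node-set to a string using only its first node, so the full phrase is never seen.

```python
import xml.etree.ElementTree as ET

element = ET.fromstring(
    "<yetAnotherElement>This can <em>not</em> be found</yetAnotherElement>"
)
em = element[0]

# The element's direct text-node children: the text before <em> and the text after it.
text_nodes = [element.text, em.tail]
print(text_nodes)

# XPath 1.0 converts the node-set text() to a string via its FIRST node only.
first_text_node = text_nodes[0]
print("This can not be found" in first_text_node)

# The full string value (all descendant text joined) does contain the phrase.
string_value = "".join(element.itertext())
print("This can not be found" in string_value)
```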

387

asked Sep 09 '13 17:09

Michael Herrmann

1 Answer

You can use

//*[contains(., 'This can not be found')]
   [not(.//*[contains(., 'This can not be found')])]

This XPath consists of two parts:

1. //*[contains(., 'This can not be found')]: Inside contains(), . refers to the context node and is converted to its string value, i.e. the concatenated text of the element and all of its descendants. This part therefore selects all elements whose string value contains 'This can not be found'. In the above example, these are <someOtherElement> and <yetAnotherElement>, but also their ancestors <nested>, <body> and <html>.
2. [not(.//*[contains(., 'This can not be found')])]: This removes elements that have a descendant element whose string value still contains 'This can not be found'. It removes the unwanted <nested>, <body> and <html> in the above example, leaving only the innermost matches.
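
The two parts can also be mimicked in plain Python, which may help to see what each predicate does. This is only a sketch under assumptions: it uses the standard-library xml.etree.ElementTree, whose XPath subset does not support contains(), so the same filtering is reproduced by hand. The helper names string_value and innermost_with_text are made up for this illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<html>
<head>...</head>
<body>
    <someElement>This can be found</someElement>
    <nested>
        <someOtherElement>This can <em>not</em> be found most nested</someOtherElement>
    </nested>
    <yetAnotherElement>This can <em>not</em> be found</yetAnotherElement>
</body>
</html>"""


def string_value(element):
    # XPath string value of an element: all descendant text nodes, concatenated.
    return "".join(element.itertext())


def innermost_with_text(root, needle):
    # Hypothetical helper mirroring the two predicates of the XPath above.
    matches = []
    for element in root.iter():
        # Part 1: contains(., needle)
        if needle not in string_value(element):
            continue
        # Part 2: not(.//*[contains(., needle)])
        # iter() yields the element itself first, so skip it to get descendants only.
        descendants = list(element.iter())[1:]
        if any(needle in string_value(d) for d in descendants):
            continue
        matches.append(element)
    return matches


root = ET.fromstring(DOC)
found = [el.tag for el in innermost_with_text(root, "This can not be found")]
print(found)
```

With a library that implements full XPath 1.0, the expression from this answer can be passed to it directly instead of being reimplemented like this.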

You can try these XPaths out here.

121

answered Sep 28 '22 19:09

Michael Herrmann
