Is there a better way of getting parent node of XPath query result?

Question

Having markup like this:

<div class="foo">
   <div><span class="a1"></span><a href="...">...</a></div>
   <div><span class="a2"></span><a href="...">...</a></div>
   <div><span class="a1"></span>some text</div>
   <div><span class="a3"></span>some text</div>
</div>

I want to get every <a> and every "some text" node ONLY if the adjacent span has class a1. So in the end my result should be the <a> from the first div and some text from the third one. It would be easy if <a> and some text were inside the span, or if the div had a class attribute, but no such luck.

What I am doing now is looking for the span with class a1:

//div[contains(@class,'foo')]/div/span[contains(@class,'a1')]

Then I get its parent and run another query() with that parent as the context node. This looks far from efficient, so the question is: is there a better way to accomplish my goal?

THE ANSWER ADDENDUM

As per @MarcB's accepted answer, the right query to use is:

//div[contains(@class,'foo')]/div/span[contains(@class,'a1')]/..

but for <a> it may be better to use:

//div[contains(@class,'foo')]/div/span[contains(@class,'a1')]/../a

to get the <a> instead of its container.

Marc B · Accepted Answer

The nice thing about xpath queries is that you can essentially treat them like a file system path, so simply having

//div[contains(@class,'foo')]/div/span[contains(@class,'a1')]/..
                                                              ^^

will find all your .a1 nodes that are below a .foo node, then move up one level to the a1 nodes' parents.
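
A minimal sketch of the parent step in Python, using the standard library's xml.etree.ElementTree. The <root> wrapper and the variable names are assumptions added for illustration. ElementTree supports only a subset of XPath and has no contains(), so the sketch uses the exact match @class='...' in its place; a full XPath engine should accept the expression above unchanged.

```python
import xml.etree.ElementTree as ET

# Markup from the question, wrapped in a hypothetical <root> element so that
# ".//div" can reach the outer div as a descendant.
MARKUP = """
<root>
  <div class="foo">
    <div><span class="a1"></span><a href="...">...</a></div>
    <div><span class="a2"></span><a href="...">...</a></div>
    <div><span class="a1"></span>some text</div>
    <div><span class="a3"></span>some text</div>
  </div>
</root>
"""

root = ET.fromstring(MARKUP)

# ".." moves from each matching span up to its parent div.
parents = root.findall(".//div[@class='foo']/div/span[@class='a1']/..")

# Appending "/a" after ".." steps back down to the sibling <a>, if there is one.
links = root.findall(".//div[@class='foo']/div/span[@class='a1']/../a")

print([p.tag for p in parents])
print([a.tag for a in links])
```

The first query yields the two divs that hold an a1 span; the second yields only the <a> from the first div, because the third div has no <a> child.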

Dimitre Novatchev · Answer

An expression that is better than using a reverse axis:

//div[contains(@class,'foo')]/div[span[contains(@class,'a1')]]

This selects any div that is a child of a div whose class attribute contains the string "foo" and that (the selected div) has a span child whose class attribute contains the string "a1".

XSLT-based verification:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
  <xsl:copy-of select=
  "//div[contains(@class,'foo')]
          /div[span[contains(@class,'a1')]]"/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document:

<div class="foo">
   <div><span class="a1"></span><a href="...">...</a></div>
   <div><span class="a2"></span><a href="...">...</a></div>
   <div><span class="a1"></span>some text</div>
   <div><span class="a3"></span>some text</div>
</div>

the XPath expression is evaluated and the selected elements are copied to the output:

<div>
   <span class="a1"/>
   <a href="...">...</a>
</div>
<div>
   <span class="a1"/>some text</div>
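
The same selection can be sketched in Python with the standard library's xml.etree.ElementTree. As far as I know, ElementTree accepts neither contains() nor a nested predicate such as div[span[...]], so this sketch emulates the predicate with a list comprehension and uses an exact class match. The <root> wrapper and the names selected and results are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Markup from the question, wrapped in a hypothetical <root> element.
MARKUP = """
<root>
  <div class="foo">
    <div><span class="a1"></span><a href="...">...</a></div>
    <div><span class="a2"></span><a href="...">...</a></div>
    <div><span class="a1"></span>some text</div>
    <div><span class="a3"></span>some text</div>
  </div>
</root>
"""

root = ET.fromstring(MARKUP)

# Emulates //div[@class='foo']/div[span[@class='a1']]:
# keep each child div that has a span child with class "a1".
selected = [
    div
    for div in root.findall(".//div[@class='foo']/div")
    if div.find("span[@class='a1']") is not None
]

# From each selected div, take the <a> if present, otherwise the text that
# follows the span (ElementTree stores that text as the span's "tail").
results = []
for div in selected:
    link = div.find("a")
    if link is not None:
        results.append(("a", link.text))
    else:
        span = div.find("span[@class='a1']")
        results.append(("text", (span.tail or "").strip()))

print(results)  # [('a', '...'), ('text', 'some text')]
```

No upward step is needed here: the filter is applied to the div itself, which is the idea behind the predicate form.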

II. Remarks on accessing an HTML element by one of its classes:

If it is known that the element can have only one class, then there is no need to use contains().

Don't use:

//div[contains(@class, 'foo')]

Use:

//div[@class = 'foo']

or, if there could be leading/trailing spaces, use:

//div[normalize-space(@class) = 'foo']

A crucial issue with:

//div[contains(@class, 'foo')]

is that it also selects any div whose class merely contains that substring, such as "myfoo", "foo2" or "myfoo3".

If the element may have more than one class, and to avoid the above issue, the correct XPath expression is:

//div[contains(concat(' ', @class, ' '), ' foo ')]
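
To make the difference concrete, here is a small Python sketch that mirrors the two tests on plain strings. The function names and sample values are made up for illustration; the logic follows the two XPath expressions above.

```python
def contains_substring(class_attr: str, name: str) -> bool:
    # Mirrors contains(@class, 'foo'): a plain substring test.
    return name in class_attr


def has_class_token(class_attr: str, name: str) -> bool:
    # Mirrors contains(concat(' ', @class, ' '), ' foo '):
    # pad both sides with spaces so only whole tokens match.
    return f" {name} " in f" {class_attr} "


samples = ["foo", "myfoo", "foo2", "myfoo3", "bar foo baz"]

substring_hits = [s for s in samples if contains_substring(s, "foo")]
token_hits = [s for s in samples if has_class_token(s, "foo")]

print(substring_hits)  # ['foo', 'myfoo', 'foo2', 'myfoo3', 'bar foo baz']
print(token_hits)      # ['foo', 'bar foo baz']
```

If class names may be separated by tabs or newlines rather than single spaces, a common refinement is to wrap @class in normalize-space() inside the concat().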

Tags: dom, xml, xpath

Marcin Orlowski
