
XPath selecting ancestors

I am trying to find a formula that creates a URL for an element based on its position in the XML hierarchy.

This is my sample xml:

<Xml>
    <Site Url="http://mysite.abc">
        <Content></Content>
        <SubSites>
            <Site Url="/sub1">
                <Content></Content>
                <SubSites>
                    <Site Url="/sub2">
                        <Content></Content>
                        <SubSites>
                            <Site Url="/sub3">
                                <Content></Content>
                            </Site>
                        </SubSites>
                    </Site>
                </SubSites>
            </Site>
        </SubSites>
    </Site>
</Xml>

I have a function in PowerShell that recursively iterates down from the top, and on each 'Content' element I want to generate a concatenation of the ancestors' Url values. So for each 'Content' node in turn it should generate:

http://mysite.abc
http://mysite.abc/sub1
http://mysite.abc/sub1/sub2
http://mysite.abc/sub1/sub2/sub3

As a starting point I currently use the following ($Node is the 'Content' element):

$Sites = $Node | Select-XML -XPath  "//ancestor::Site"

But for every $Node it selects all the 'Site' elements. I would expect it to find more ancestors the further down the XML structure it goes.

If someone knows how to concatenate the values directly with XPath, that would be especially great, but for starters I would be happy to know what is going wrong with my current approach.

oysterhoys asked Mar 10 '18 10:03



2 Answers

To offer an alternative to Mathias R. Jessen's helpful answer (which explains the problem with your approach well and offers an effective solution):

Since the Site node always seems to be the parent of any given Content node, you can simply refer to the respective Site node with a .. path component.

This approach allows you to process the entire document at once:

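Select-Xml -LiteralPath sample.xml -XPath '//Content/..' | ForEach-Object -Begin {
  # URL accumulated so far while walking down the hierarchy.
  $ancestralUrl = ''
} -Process {
  $thisUrl = $_.Node.Url
  if ($thisUrl -match '^https?://') {
    # An absolute URL starts a new chain.
    $ancestralUrl = $thisUrl
  } else {
    # A relative URL extends the chain; the chained assignment updates both variables.
    $thisUrl = $ancestralUrl += $thisUrl
  }
  # Output the full URL for this Site node.
  $thisUrl
}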

The above yields:

http://mysite.abc
http://mysite.abc/sub1
http://mysite.abc/sub1/sub2
http://mysite.abc/sub1/sub2/sub3

In fact, you can even combine the above approach with the ancestor axis (though it would be overkill here):

Select-Xml -LiteralPath sample.xml '//Content/ancestor::Site' | ForEach-Object -Begin {
  $ancestralUrl = ''
} -Process {
  $thisUrl = $_.Node.Url
  if ($thisUrl -match '^https?://') {
    $ancestralUrl = $thisUrl
  } else {
    $thisUrl = $ancestralUrl += $thisUrl
  }
  $thisUrl
}

mklement0 answered Oct 10 '22 19:10


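//ancestor::Site is evaluated from the document root: it gives you every Site element that is an ancestor of any node (//) in the tree, so the result is the same no matter which $Node you pipe in.

Use ./ancestor::Site to grab only the ancestors relative to the current node (.):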

$Sites = $Node | Select-XML -XPath  "./ancestor::Site"
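
To tie this back to the concatenation asked for in the question, here is a minimal sketch. As far as I know, the XPath 1.0 engine behind Select-Xml has no function for joining a node-set of arbitrary length, so the join is done in PowerShell. Get-ContentUrl is a hypothetical helper name, the sample XML is inlined from the question, and the sketch assumes the ancestors come back outermost first (document order).

```powershell
# Sample document from the question, inlined so the sketch is self-contained.
[xml] $doc = @'
<Xml>
  <Site Url="http://mysite.abc">
    <Content></Content>
    <SubSites>
      <Site Url="/sub1">
        <Content></Content>
        <SubSites>
          <Site Url="/sub2">
            <Content></Content>
            <SubSites>
              <Site Url="/sub3">
                <Content></Content>
              </Site>
            </SubSites>
          </Site>
        </SubSites>
      </Site>
    </SubSites>
  </Site>
</Xml>
'@

# Get-ContentUrl is a hypothetical helper name, not part of any module.
function Get-ContentUrl {
  param([Parameter(Mandatory)] [System.Xml.XmlNode] $Node)

  # Relative path: only the Site elements that are ancestors of $Node.
  # Assumption: they arrive outermost first (document order).
  $sites = $Node | Select-Xml -XPath 'ancestor::Site'

  # Join the Url attributes in PowerShell rather than in XPath.
  ($sites | ForEach-Object { $_.Node.GetAttribute('Url') }) -join ''
}

$contentNodes = $doc.SelectNodes('//Content')

# How many Site nodes each expression matches per Content node.
$absoluteCounts = $contentNodes | ForEach-Object { @($_ | Select-Xml -XPath '//ancestor::Site').Count }
$relativeCounts = $contentNodes | ForEach-Object { @($_ | Select-Xml -XPath 'ancestor::Site').Count }

# One concatenated URL per Content node.
$urls = $contentNodes | ForEach-Object { Get-ContentUrl $_ }
$urls
```

With the sample document, $absoluteCounts holds 4, 4, 4, 4 (the symptom described in the question) while $relativeCounts holds 1, 2, 3, 4. Because each URL is built from the node's own ancestors, this should also cope with a Site that has several sibling sub-sites.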

Mathias R. Jessen answered Oct 10 '22 20:10