Is there a way to do an xpath query on a DOMNode? Or at least convert it to a DOMXPath?
    <html>
    ...
    <div id="content">
        ...
        <div class="listing">
            ...
            <div></div>
            <div></div>
            <div class='foo'>
                <h3>Get me 1</h3>
                <a>and me too 1</a>
            </div>
        </div>
        <div class="listing">
            ...
            <div></div>
            <div></div>
            <div class='foo'>
                <h3>Get me 2</h3>
                <a>and me too 1</a>
            </div>
        </div>
        ....
    </div>
    </html>
This is my code. I am trying to build a list of arrays, each holding the values of the h3 and a tags from one listing. To do that, I need to get each listing and then get the values of the h3 and a tags inside it.
    $html_dom = new DOMDocument();
    @$html_dom->loadHTML($html);
    $x_path = new DOMXPath($html_dom);
    $nodes = $x_path->query("//div[@id='content']//div[@class='listing']");
    foreach ($nodes as $node) {
        // I want to further dig down here using query on a DOMNode
    }
DOMXPath::query() is a built-in PHP method that evaluates the given XPath expression and returns the matching nodes as a DOMNodeList. Syntax: DOMNodeList DOMXPath::query( string $expression, DOMNode $contextnode, bool $registerNodeNS ). The $contextnode and $registerNodeNS parameters are optional.
Note that HTML and XML have a very similar structure, which is why XPath can be used almost interchangeably to navigate both HTML and XML documents.
XPath stands for XML Path Language. It uses a non-XML syntax to provide a flexible way of addressing (pointing to) different parts of an XML document. It can also be used to test addressed nodes within a document to determine whether they match a pattern or not.
Pass the node as the second argument to DOMXPath::query().
contextnode: The optional contextnode can be specified for doing relative XPath queries. By default, the queries are relative to the root element.
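One thing worth checking when passing a context node is whether the expression itself is relative. The sketch below uses hypothetical sample markup modelled on the question (the `$html` string and variable names are assumptions) and compares `//h3`, which still searches from the document root, with `.//h3`, which stays inside the context node.

```php
<?php
// Hypothetical sample markup, modelled on the structure in the question.
$html = <<<'HTML'
<html><body><div id="content">
  <div class="listing"><div class="foo"><h3>Get me 1</h3><a>and me too 1</a></div></div>
  <div class="listing"><div class="foo"><h3>Get me 2</h3><a>and me too 1</a></div></div>
</div></body></html>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Use the first listing as the context node.
$listing = $xpath->query("//div[@class='listing']")->item(0);

// A leading "//" is an absolute path: it matches every h3 in the document,
// even though a context node is supplied.
$absolute = $xpath->query('//h3', $listing);

// A leading ".//" is relative: it matches only h3 descendants of $listing.
$relative = $xpath->query('.//h3', $listing);

echo $absolute->length, PHP_EOL; // 2
echo $relative->length, PHP_EOL; // 1
```

Expressions without a leading slash, such as `h3|a`, are also relative, but they only match direct children of the context node.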
Example:
    foreach ($nodes as $node) {
        foreach ($x_path->query('h3|a', $node) as $child) {
            echo $child->nodeValue, PHP_EOL;
        }
    }
This uses the XPath union operator (|) and produces:
    Get me 1
    and me too 1
    Get me 2
    and me too 1
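If the goal is the list of arrays described in the question, the same context-node idea can be used to build it. This is only a sketch under assumptions: the sample markup is hypothetical, the array keys `h3` and `a` are my own naming, and it swaps in DOMXPath::evaluate() with the XPath string() function instead of query(), so each value comes back as a plain string.

```php
<?php
// Hypothetical sample markup, modelled on the structure in the question.
$html = <<<'HTML'
<html><body><div id="content">
  <div class="listing"><div></div><div class="foo"><h3>Get me 1</h3><a>and me too 1</a></div></div>
  <div class="listing"><div></div><div class="foo"><h3>Get me 2</h3><a>and me too 1</a></div></div>
</div></body></html>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$rows = [];
foreach ($xpath->query("//div[@id='content']//div[@class='listing']") as $listing) {
    // string(...) yields the text of the first matching node,
    // or an empty string when nothing matches.
    $rows[] = [
        'h3' => trim($xpath->evaluate('string(.//h3)', $listing)),
        'a'  => trim($xpath->evaluate('string(.//a)', $listing)),
    ];
}

print_r($rows);
```

Because string() takes only the first match, a listing with several h3 or a elements would need a nested query() loop instead.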
If you don't need any complex querying, you can also do
    foreach ($nodes as $node) {
        foreach ($node->getElementsByTagName('a') as $a) {
            echo $a->nodeValue, PHP_EOL;
        }
    }
Or even by iterating the child nodes (note that this includes all the text nodes):
    foreach ($nodes as $node) {
        foreach ($node->childNodes as $child) {
            echo $child->nodeName, PHP_EOL;
        }
    }
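Since childNodes mixes text nodes with elements, one way to skip the text nodes is to check each child's nodeType. A minimal sketch, assuming hypothetical markup with some stray text inside the listing (the `$html` string and variable names are made up for illustration):

```php
<?php
// Hypothetical markup: one listing that contains stray text plus two child divs.
$html = <<<'HTML'
<html><body>
<div class="listing">
  stray text
  <div></div>
  <div class="foo"><h3>Get me 1</h3><a>and me too 1</a></div>
</div>
</body></html>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$listing = $xpath->query("//div[@class='listing']")->item(0);

$allNames = [];      // every child node, text nodes included
$elementNames = [];  // element children only
foreach ($listing->childNodes as $child) {
    $allNames[] = $child->nodeName;
    if ($child->nodeType === XML_ELEMENT_NODE) {
        $elementNames[] = $child->nodeName;
    }
}

echo implode(', ', $allNames), PHP_EOL;     // includes "#text" entries
echo implode(', ', $elementNames), PHP_EOL; // div, div
```

How many whitespace-only text nodes show up can depend on the parser, which is why the sketch filters by node type rather than by position.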
However, all of that is unneeded since you can fetch these nodes directly:
    $nodes = $x_path->query("/html/body//div[@class='listing']/div[last()]");
    foreach ($nodes as $i => $node) {
        echo $i, $node->nodeValue, PHP_EOL;
    }
This gives you two nodes, one for the last div child of each div whose class attribute is listing, and outputs their combined text node values, including whitespace:
    0 Get me 1 and me too 1
    1 Get me 2 and me too 1
Likewise, the following
"//div[@class='listing']/div[last()]/node()[name() = 'h3' or name() = 'a']"
will give you the four child H3 and A nodes and output
    0Get me 1
    1and me too 1
    2Get me 2
    3and me too 1
If you need to differentiate these by name while iterating over them, you can do
    foreach ($nodes as $i => $node) {
        echo $i, $node->nodeName, $node->nodeValue, PHP_EOL;
    }
which will then give
    0h3Get me 1
    1aand me too 1
    2h3Get me 2
    3aand me too 1