
How can I execute XPath queries on DOMElements using PHP?

I'm trying to run XPath queries on DOMElements, but it doesn't seem to work. Here is the HTML:

<html>
    <div class="test aaa">
        <div></div>
        <div class="link">contains a link</div>
        <div></div>
    </div>
    <div class="test bbb">
        <div></div>
        <div></div>
        <div class="link">contains a link</div>
    </div>
</html>

What I'm doing is this:

$dom = new DOMDocument();
$html = file_get_contents("file.html");
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$entries = $xpath->query("//div[contains(@class,'test')]");
if (!$entries->length > 0) {
    echo "Nothing\n";
} else {
  foreach ($entries as $entry) {
      $link = $xpath->query('/div[@class=link]',$entry);
      echo $link->item(0)->nodeValue;
      // => PHP Notice:  Trying to get property of non-object
  }
}

Everything works fine up to $xpath->query('/div[@class=link]',$entry);. I don't know how to use XPath on a particular DOMElement ($entry).

How can I use XPath queries on a DOMElement?

Max asked May 08 '11 18:05




1 Answer

It looks like you're trying to mix CSS selectors with XPath. XPath has no .class shorthand; use a predicate ([...]) that tests the value of the class attribute instead.

For example, your //div.link might look like //div[contains(concat(' ',normalize-space(@class),' '),' link ')].
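To see why the longer predicate is worth the noise, here is a minimal sketch comparing it with a plain contains() test. The markup and the variable names ($naive, $token) are made up for illustration and are not from the question.

```php
<?php
// Hypothetical markup: one element whose class list includes the token
// "link", and one whose class merely contains that substring.
$html = '<div><p class="nav link">a</p><p class="linked">b</p></div>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Substring test: matches both "nav link" and "linked".
$naive = $xpath->query("//p[contains(@class, 'link')]");

// Token test: pad the normalized class list with spaces, then look for
// " link " so that only a whole class name can match.
$token = $xpath->query(
    "//p[contains(concat(' ', normalize-space(@class), ' '), ' link ')]"
);

echo $naive->length, "\n"; // 2
echo $token->length, "\n"; // 1
```

The token form behaves like the CSS selector p.link, which is usually what people mean when they reach for contains() on a class attribute.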

Secondly, within the loop you pass a context node to the query but then ignore it by using an absolute location path (one that starts with a slash).

Updated to reflect changes to the question:

Your second XPath expression (/div[@class=link]) is still a) absolute, and b) has an incorrect condition: the unquoted link is read as the name of a child element, not as a string. You want to be asking for matching elements relative to the specified context node ($entry) with the class attribute having a string value of link.

So /div[@class=link] should become something like div[@class="link"], which searches the children of each $entry element (use .//div[...] or descendant::div[...] if you want to search deeper).
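Putting that together, a corrected version of the loop could look like the sketch below. It assumes the HTML from the question, inlined as a string instead of read from file.html so the example is self-contained; the $found array is only there for illustration, and libxml_use_internal_errors() stands in for the @ suppression.

```php
<?php
// The question's markup, inlined so the example runs on its own.
$html = <<<HTML
<html>
    <div class="test aaa">
        <div></div>
        <div class="link">contains a link</div>
        <div></div>
    </div>
    <div class="test bbb">
        <div></div>
        <div></div>
        <div class="link">contains a link</div>
    </div>
</html>
HTML;

// Collect parser warnings internally rather than silencing them with @.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$found = []; // illustrative: collects the text of each matched node

foreach ($xpath->query("//div[contains(@class, 'test')]") as $entry) {
    // Relative path (no leading slash) and a quoted string literal,
    // evaluated with $entry as the context node.
    $link = $xpath->query('div[@class="link"]', $entry)->item(0);

    // item(0) returns null when nothing matched, so guard before use.
    if ($link !== null) {
        $found[] = $link->nodeValue;
        echo $link->nodeValue, "\n";
    }
}
```

This prints "contains a link" once for each of the two outer divs. Swap the inner expression for './/div[@class="link"]' if the target may be nested more than one level below $entry.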

salathe answered Nov 04 '22 12:11