
How to iterate over childNodes using foreach loop?

Tags: dom, php, xpath

Consider the following PHP code

<?php

 $html_data = 
 '<html><body>
  <ol>
  <li><strong>Question 1</strong> Answer1</li>
  <li><strong>Question 2</strong> Answer2</li>
  </ol></body></html>';

  $doc = new DOMDocument();
  $doc->loadHTML($html_data);
  $xpath = new DOMXPath($doc);

  $ols = $xpath->query('//ol');
  $ol = $ols->item(0);
  $lis = $ol->childNodes;

  foreach ($lis as $li) {
    echo $li->firstChild->nodeValue."<br />";
    echo $li->lastChild->nodeValue."<br />";
    //echo $li->childNodes->item(0)->nodeValue."<br />";
  }
  ?>

If I uncomment the last line of the loop and access the childNodes list (a DOMNodeList) through item(), my foreach loop executes only once. However, if I access the same elements using firstChild and lastChild as shown above, I can successfully iterate over all the 'li' tags present.

I can't make any sense of this at all. Is this a bug in PHP?

Gowtham asked Jan 27 '13 19:01



2 Answers

If you didn't suppress error reporting, you would have seen a fatal error that stops your script.

In order to work with the item method:

foreach ($lis as $li) {
  if (method_exists($li->childNodes, 'item')) {
    echo $li->childNodes->item(0)->nodeValue."<br />";
    // To reproduce the exact output you need this line also. 
    // You need to display the second child (Answer)
    echo $li->childNodes->item(1)->nodeValue."<br />";
  }  
}

The only difference is that the first script

foreach ($lis as $li) {
  echo $li->firstChild->nodeValue."<br />";
  echo $li->lastChild->nodeValue."<br />";    
  //echo $li->childNodes->item(0)->nodeValue."<br />";
}

only raises a notice (Notice: Trying to get property of non-object), and the script continues.

Calling item(), by contrast, raises a fatal error (Fatal error: Call to a member function item() on a non-object), which kills your script.
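A minimal sketch of another way to guard the loop: test each child with instanceof DOMElement instead of method_exists(), so that whitespace text nodes are skipped before item() is ever called. This is an illustration rather than the original answer's method; the $rows array is a hypothetical name used only to collect the output, and the foreach destructuring assumes PHP 7.1 or later.

```php
<?php
// Same markup as in the question, including the line breaks between the <li> tags.
$html_data =
'<html><body>
 <ol>
 <li><strong>Question 1</strong> Answer1</li>
 <li><strong>Question 2</strong> Answer2</li>
 </ol></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html_data);
$xpath = new DOMXPath($doc);
$ol = $xpath->query('//ol')->item(0);

$rows = []; // hypothetical collector for the question/answer pairs
foreach ($ol->childNodes as $node) {
    // Whitespace between the tags shows up as DOMText nodes;
    // skip anything that is not an element.
    if (!$node instanceof DOMElement) {
        continue;
    }
    $rows[] = [
        trim($node->childNodes->item(0)->nodeValue), // <strong> text
        trim($node->childNodes->item(1)->nodeValue), // text after </strong>
    ];
}

foreach ($rows as [$question, $answer]) {
    echo $question, "<br />", $answer, "<br />\n";
}
```

Filtering on the node class keeps the loop independent of how many whitespace nodes the parser decides to keep.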

For more details on how to iterate over these node lists (foreach vs. for), read the comments on these pages:

  • http://www.php.net/manual/en/class.domnodelist.php
  • http://www.php.net/manual/en/domnodelist.item.php

You run into this issue specifically because of the whitespace (line breaks and indentation) that follows each </li> tag.

The loop visits the nodes in this order: the first <li> element, then a whitespace DOMText node, then the second <li> element, then another whitespace DOMText node.

It crashes on the first DOMText node, which has no child nodes. If you remove the whitespace between the tags, it works:

$html_data = '<html><body><ol><li><strong>Question 1</strong> Answer1</li><li><strong>Question 2</strong> Answer2</li></ol></body></html>';
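To see the whitespace nodes directly, here is a small sketch that prints the class of each child node for a spaced and a compact version of the list. It uses loadXML() on a reduced document rather than loadHTML(), because the HTML parser may keep or drop some whitespace differently; childKinds() is a hypothetical helper written for this illustration.

```php
<?php
// Hypothetical helper: returns the class name of each child of the root element.
function childKinds(string $xml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);

    $kinds = [];
    foreach ($doc->documentElement->childNodes as $node) {
        $kinds[] = get_class($node);
    }
    return $kinds;
}

// Line breaks between the tags, as in the question.
$spaced = "<ol>\n<li>a</li>\n<li>b</li>\n</ol>";
// No whitespace between the tags, as in the fixed string above.
$compact = "<ol><li>a</li><li>b</li></ol>";

echo implode(', ', childKinds($spaced)), "\n";
// prints: DOMText, DOMElement, DOMText, DOMElement, DOMText
echo implode(', ', childKinds($compact)), "\n";
// prints: DOMElement, DOMElement
```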
user1236048 answered Nov 01 '22 18:11


I tried to reproduce your problem (on PHP 5.3.14) with the following code:

Interactive shell

php > $xml = <<<XML
<<< > <root>
<<< > <ol>
<<< > <li><strong>Question 1</strong> Answer1</li>
<<< > <li><strong>Question 2</strong> Answer2</li>
<<< > </ol>
<<< > </root>
<<< > XML;
php > $doc = new DOMDocument();
php > $doc->loadXML($xml);
php > $xpath = new DOMXPath($doc);
php > $ols = $xpath->query('//ol');
php > $ol = $ols->item(0);
php > $lis = $xpath->query('//li', $ol);
php > foreach ($lis as $li) {
php { echo $li->firstChild->nodeValue."<br />";
php { echo $li->lastChild->nodeValue."<br />";
php { echo $li->childNodes->item(0)->nodeValue."<br />";
php { }
Question 1<br /> Answer1<br />
Question 1<br />
Question 2<br /> Answer2<br />
Question 2<br />

As you can see, I could not reproduce the problem; everything works fine. The only thing I changed was $lis = $ol->childNodes; to $lis = $xpath->query('//li', $ol); because otherwise the list also contained the whitespace text nodes between the <li> nodes and the script crashed.
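One caveat about the query above, shown as a hedged sketch: in XPath, a path that starts with // is evaluated from the document root even when a context node is passed, so '//li' returns every <li> in the document, not only those under $ol. A relative path such as 'li' restricts the result to element children of the context node. The two-list document below is made up for this illustration.

```php
<?php
// Hypothetical document with two lists, to make the difference visible.
$xml = '<root><ol><li>A</li><li>B</li></ol><ol><li>C</li><li>D</li></ol></root>';

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);
$firstOl = $xpath->query('//ol')->item(0);

// Relative path: only <li> children of the first <ol>.
$relative = $xpath->query('li', $firstOl);
// Absolute path: every <li> in the document; the context node has no effect.
$absolute = $xpath->query('//li', $firstOl);

echo $relative->length, "\n"; // prints 2
echo $absolute->length, "\n"; // prints 4
```

With a single list in the document, as in the question, both queries return the same two nodes, which is why the reproduction above works.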

Fabian Schmengler answered Nov 01 '22 16:11