
PHP DOM - counting child nodes?

Tags: dom, php

HTML snippet #1

<div>
</div>
<div>
    <h1>headline</h1>
</div>

HTML snippet #2

<div></div>
<div><h1>headline</h1></div>

PHP code

$doc = new DOMDocument();
$doc->loadHTML($x);
$xpath = new DOMXpath($doc);
$divs = $xpath->query("//div");

foreach ($divs as $div) echo $div->childNodes->length,"<br />";

Output with $x = snippet #1
1
3

Output with $x = snippet #2
0
1

See the working demo: http://codepad.viper-7.com/11BGge

My questions
1. Why do the two snippets give different counts when they contain the same elements?
2. How do I count child nodes correctly with DOM?

EDIT:
As silkfire said, whitespace between tags is considered a text node. I set

$doc->preserveWhiteSpace = false;

but the results are still the same: http://codepad.viper-7.com/bnG5io

Any ideas?

michi asked May 09 '13 21:05


2 Answers

Just count non-text nodes in your loop:

$count = 0;
foreach ($div->childNodes as $node) {
    // skip text nodes, including whitespace-only ones
    if (!($node instanceof DOMText)) {
        $count++;
    }
}

print $count;
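
The loop above still counts comment nodes, because DOMComment is not a DOMText. If only elements should count, one option is to test nodeType instead. This is a minimal sketch under that assumption; the helper name countElementChildren and the sample markup (snippet #1 plus an added comment) are hypothetical, not part of the original answer.

```php
<?php
// Hypothetical helper: counts only the element children of a node.
function countElementChildren(DOMNode $parent): int
{
    $count = 0;
    foreach ($parent->childNodes as $node) {
        // XML_ELEMENT_NODE excludes text, comment and CDATA nodes
        if ($node->nodeType === XML_ELEMENT_NODE) {
            $count++;
        }
    }
    return $count;
}

// Snippet #1 from the question, with an extra comment node added for illustration
$html = "<div>\n</div>\n<div>\n    <h1>headline</h1>\n    <!-- note -->\n</div>";
$doc = new DOMDocument();
$doc->loadHTML($html);

$counts = [];
foreach ($doc->getElementsByTagName('div') as $div) {
    $counts[] = countElementChildren($div);
}
echo implode(', ', $counts), "\n"; // prints "0, 1"
```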

Using xpath:

$nodesFromDiv1 = $xpath->query("//div[1]/*")->length;
$nodesFromDiv2 = $xpath->query("//div[2]/*")->length;
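
Note that //div[1] selects every div that is the first div child of its parent, so the two queries above line up with the two divs only because both are siblings under body. A sketch that avoids positional indexes is to evaluate count(*) relative to each div. The variable names are illustrative; DOMXPath::evaluate() accepts a context node as its second argument.

```php
<?php
// Snippet #1 from the question
$html = "<div>\n</div>\n<div>\n    <h1>headline</h1>\n</div>";
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$counts = [];
foreach ($xpath->query('//div') as $div) {
    // count(*) is evaluated relative to $div and counts element children only;
    // evaluate() returns a float for numeric expressions, hence the cast
    $counts[] = (int) $xpath->evaluate('count(*)', $div);
}
echo implode(', ', $counts), "\n"; // prints "0, 1"
```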

To remove whitespace-only text nodes when preserveWhiteSpace = false has no effect (as I suggested in the chat):

$textNodes = $xpath->query('//text()');

foreach ($textNodes as $node) {
    // remove text nodes that contain only whitespace
    if (trim($node->wholeText) === '') {
        $node->parentNode->removeChild($node);
    }
}

nice ass answered Nov 15 '22 08:11


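Whitespace between tags is parsed as a text node (a DOMText, which XPath matches with text()), so it is included in childNodes. That is why snippet #1 reports 1 and 3 while snippet #2, which has no whitespace inside the divs, reports 0 and 1.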
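
To see those nodes directly, it can help to dump the class and value of every child. This is only an inspection sketch, assuming snippet #1 as input; the $report array and its line format are made up for illustration. For snippet #1 it should list one DOMText for the first div and a DOMText, a DOMElement and another DOMText for the second, which matches the 1 and 3 in the question.

```php
<?php
// Snippet #1 from the question
$html = "<div>\n</div>\n<div>\n    <h1>headline</h1>\n</div>";
$doc = new DOMDocument();
$doc->loadHTML($html);

$report = [];
$divNumber = 0;
foreach ($doc->getElementsByTagName('div') as $div) {
    $divNumber++;
    foreach ($div->childNodes as $child) {
        // json_encode makes the otherwise invisible whitespace visible
        $report[] = sprintf(
            'div %d: %s %s',
            $divNumber,
            get_class($child),
            json_encode($child->nodeValue)
        );
    }
}
echo implode("\n", $report), "\n";
```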

You can make this work by subtracting the number of text nodes from the total in your foreach loop:

foreach ($divs as $div) {
    echo $div->childNodes->length - $xpath->query('./text()', $div)->length, '<br>';
}

silkfire answered Nov 15 '22 08:11