HTML snippet #1
<div>
</div>
<div>
<h1>headline</h1>
</div>
HTML snippet #2
<div></div>
<div><h1>headline</h1></div>
PHP code
$doc = new DOMDocument();
$doc->loadHTML($x);
$xpath = new DOMXpath($doc);
$divs = $xpath->query("//div");
foreach ($divs as $div) echo $div->childNodes->length,"<br />";
Output with $x = snippet #1
1
3
Output with $x = snippet #2
0
1
See the working demo: http://codepad.viper-7.com/11BGge
My questions
1. How can this be?
2. How can I count child nodes correctly with DOM?
EDIT:
As Silkfire said, whitespace is considered a text node. I set
$doc->preserveWhiteSpace = false;
but the results are still the same: http://codepad.viper-7.com/bnG5io
Any ideas?
Just count non-text nodes in your loop:
$count = 0;
foreach ($div->childNodes as $node) {
    // Skip text nodes, including whitespace-only ones
    if (!($node instanceof \DOMText)) {
        $count++;
    }
}
print $count;
Using XPath, where * selects only element children:
$nodesFromDiv1 = $xpath->query("//div[1]/*")->length;
$nodesFromDiv2 = $xpath->query("//div[2]/*")->length;
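To make that concrete, here is a minimal, self-contained sketch that runs the element-only count against snippet #1. The variable names $html and $counts are mine, not from the question, and the snippet is written with explicit "\n" line breaks to reproduce the whitespace.

```php
<?php
// Snippet #1 from the question; the "\n" line breaks become whitespace text nodes.
$html = "<div>\n</div>\n<div>\n<h1>headline</h1>\n</div>";

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$counts = [];
foreach ($xpath->query('//div') as $div) {
    // "./*" is evaluated relative to the current div and matches elements only,
    // so whitespace text nodes are never counted.
    $counts[] = $xpath->query('./*', $div)->length;
}

echo implode(', ', $counts), "\n"; // prints "0, 1"
```

Querying relative to each div avoids hard-coding positions such as //div[1], which selects every div that is the first div child of its parent rather than the first div in the document.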
To remove whitespace-only text nodes when preserveWhiteSpace = false does not work (as I suggested in the chat):
$textNodes = $xpath->query('//text()');
foreach ($textNodes as $node) {
    // A text node that trims to an empty string contains only whitespace
    if (trim($node->wholeText) === '') {
        $node->parentNode->removeChild($node);
    }
}
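As a rough end-to-end check, the sketch below applies that cleanup to snippet #1 and then reads childNodes->length again. It assumes the markup is loaded exactly as in the question; $html and $lengths are illustrative names.

```php
<?php
// Snippet #1 from the question, including the line breaks between tags.
$html = "<div>\n</div>\n<div>\n<h1>headline</h1>\n</div>";

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Remove every text node that consists only of whitespace.
foreach ($xpath->query('//text()') as $node) {
    if (trim($node->wholeText) === '') {
        $node->parentNode->removeChild($node);
    }
}

// With the whitespace gone, childNodes->length matches snippet #2.
$lengths = [];
foreach ($xpath->query('//div') as $div) {
    $lengths[] = $div->childNodes->length;
}

echo implode(', ', $lengths), "\n"; // prints "0, 1"
```

Note that this modifies the document itself, so it only suits cases where the whitespace is not needed later (for example, when the document is not re-serialized with its original formatting).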
Whitespace is considered a node because it is a text() node (DOMText).
You can make this work by changing your foreach loop:
foreach ($divs as $div) {
    echo $div->childNodes->length - $xpath->query('./text()', $div)->length, '<br>';
}
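If you want to see where the 1 and 3 come from, a small sketch like the following lists the node name of every child of each div in snippet #1. The $report array is just an illustrative helper, not part of the original code.

```php
<?php
// Snippet #1 from the question, including the line breaks between tags.
$html = "<div>\n</div>\n<div>\n<h1>headline</h1>\n</div>";

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$report = [];
foreach ($xpath->query('//div') as $div) {
    $names = [];
    foreach ($div->childNodes as $child) {
        // nodeName is "#text" for text nodes and the tag name for elements.
        $names[] = $child->nodeName;
    }
    $report[] = implode(' ', $names);
}

echo implode("\n", $report), "\n";
// prints:
// #text
// #text h1 #text
```

The first div holds a single whitespace text node, and the second holds the h1 with a whitespace text node on each side, which is why childNodes->length reports 1 and 3.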