I'm using simple_html_dom to get elements from HTML pages. I have some div elements like this. All I want is to get the "Fine Thanks" sentence in each div (the text that is not inside any sub-element). How can I do it?
<div class="right">
<h2>
<a href="">Hello</a>
</h2>
<br/>
<span>How Are You?</span>
<span>How Are You?</span>
<span>How Are You?</span>
Fine Thanks
</div>
It should simply be $html->find('div.right > text'), but that won't work because Simple HTML DOM Parser doesn't seem to support direct descendant queries.
So you'd have to find all <div> elements first and search the child nodes for a text node. Unfortunately, the ->childNodes() method is mapped to ->children() and thus only returns elements.
A working solution is to call ->find('text') on each <div> element, after which you filter the results based on the parent node.
foreach ($doc->find('div.right') as $parent) {
    foreach ($parent->find('text') as $node) {
        // Keep only non-empty text nodes whose direct parent is the <div> itself.
        if ($node->parent() === $parent && strlen($t = trim($node->plaintext))) {
            echo $t, PHP_EOL;
        }
    }
}
Using DOMDocument, this XPath expression will do the same work without the pain:
$doc = new DOMDocument;
$doc->loadHTML($content);
$xp = new DOMXPath($doc);
// Select text nodes that are direct children of <div class="right">.
foreach ($xp->query('//div[@class="right"]/text()') as $node) {
    if (strlen($t = trim($node->textContent))) {
        echo $t, PHP_EOL;
    }
}
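For reference, here is a minimal self-contained sketch of the DOMDocument approach run against the markup from the question. The function name directTextOfRightDivs is hypothetical and only for illustration. The sketch assumes the class attribute is exactly "right" (a div with several classes would need a different predicate), and it silences libxml parse warnings, which real-world HTML tends to trigger.

```php
<?php
// Hypothetical helper (not from the original answer): returns the trimmed,
// non-empty text nodes that are direct children of <div class="right">.
function directTextOfRightDivs(string $content): array
{
    $doc = new DOMDocument;

    // Collect parse warnings internally instead of emitting them,
    // then restore the previous error-handling setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($content);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xp = new DOMXPath($doc);
    $texts = [];

    // Assumption: the class attribute is exactly "right".
    foreach ($xp->query('//div[@class="right"]/text()') as $node) {
        $t = trim($node->textContent);
        if ($t !== '') {
            $texts[] = $t;
        }
    }

    return $texts;
}

// Sample markup copied from the question.
$content = <<<HTML
<div class="right">
<h2>
<a href="">Hello</a>
</h2>
<br/>
<span>How Are You?</span>
<span>How Are You?</span>
<span>How Are You?</span>
Fine Thanks
</div>
HTML;

print_r(directTextOfRightDivs($content));
```

The whitespace-only text nodes between the child elements are dropped by the trim() check, so for the sample markup only "Fine Thanks" should remain in the returned array.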