 

Remove all empty html elements using PHP: DOMDocument

Is there any way to remove all empty elements from an HTML document without using regex?

I did this with DOMXPath:

$this->dom->loadHTML($document, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new \DOMXPath($this->dom);
while (($node_list = $xpath->query('//*[not(*) and not(@*) and not(text()[normalize-space()])]')) && $node_list->length) {
    foreach ($node_list as $node) {
        $node->parentNode->removeChild($node);
    }
}
marnaels asked Jun 21 '26 15:06



1 Answer

It may be unclear that the author has already answered the question by editing the post, and I can't comment to ask for the question to be closed appropriately, so I am copying the same code here as an actual answer.
One important note: the comments refer to another topic, but the solution there works only for flat documents, while the OP's solution also works with deeply nested trees. It helped me a lot.

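// Parse without adding implied <html>/<body> wrappers or a doctype.
$this->dom->loadHTML($document, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new \DOMXPath($this->dom);
// Match elements with no child elements, no attributes, and no non-whitespace text.
// Re-run the query until nothing matches, so a parent emptied by one pass is removed in the next.
while (($node_list = $xpath->query('//*[not(*) and not(@*) and not(text()[normalize-space()])]')) && $node_list->length) {
    foreach ($node_list as $node) {
        $node->parentNode->removeChild($node);
    }
}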
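
For anyone who wants to try this outside a class, here is a minimal self-contained sketch of the same loop. The function name `removeEmptyElements` and the `$keepTags` parameter are my own assumptions, not part of the OP's code. The keep list is there because the original predicate also matches attribute-less void elements such as `<br>` and `<hr>`, which you may not want to lose.

```php
<?php

/**
 * Hypothetical helper, not part of the original post.
 * Removes elements with no child elements, no attributes and no
 * non-whitespace text, except tags listed in $keepTags.
 */
function removeEmptyElements(string $html, array $keepTags = ['br', 'hr']): string
{
    $dom = new DOMDocument();
    // Collect libxml parse warnings instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Same predicate as the original query, plus an optional exclusion
    // for tags that are legitimately empty. Tag names are assumed trusted.
    $query = '//*[not(*) and not(@*) and not(text()[normalize-space()])';
    if ($keepTags !== []) {
        $tests = array_map(fn (string $tag): string => "self::$tag", $keepTags);
        $query .= ' and not(' . implode(' or ', $tests) . ')';
    }
    $query .= ']';

    $xpath = new DOMXPath($dom);
    // Each pass removes the currently empty elements; a parent that becomes
    // empty as a result is matched by the next pass.
    while (($nodes = $xpath->query($query)) && $nodes->length) {
        foreach ($nodes as $node) {
            $node->parentNode->removeChild($node);
        }
    }

    return trim((string) $dom->saveHTML());
}

// Single root element: with LIBXML_HTML_NOIMPLIED, input that has several
// top-level elements may be restructured by the parser.
$html = '<div id="root"><p>Line<br>break</p><div><ul><li>  </li></ul></div><p></p></div>';
echo removeEmptyElements($html), PHP_EOL;
```

With this sample input, the `<li>` and the empty `<p>` should go in the first pass, then the `<ul>`, then the inner `<div>`, which is the deep-tree behaviour mentioned above. The `<br>` survives only because it is in the keep list; passing `[]` as the second argument gives the OP's original behaviour.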
Simbiat answered Jun 24 '26 15:06
