Is there any way to remove all empty elements from an html without using regex?
I did this with DOMXPath
$this->dom->loadHTML($document, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new \DOMXPath($this->dom);
while (($node_list = $xpath->query('//*[not(*) and not(@*) and not(text()[normalize-space()])]')) && $node_list->length) {
foreach ($node_list as $node) {
$node->parentNode->removeChild($node);
}
}
Since it may be quite unclear, that the question has been already answered by the author (through editing the post) and I can't comment to ask for appropriate question closure, copying the same code as an actual answer.
An important thing: comments refer to another topic, but the solution there works only for flat documents, while the solution from OP does work with deep trees. It helped me quite a bunch.
$this->dom->loadHTML($document, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new \DOMXPath($this->dom);
while (($node_list = $xpath->query('//*[not(*) and not(@*) and not(text()[normalize-space()])]')) && $node_list->length) {
foreach ($node_list as $node) {
$node->parentNode->removeChild($node);
}
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With