I'm using DOM to parse string. I need function that strips span tags and its contents. For example, if I have:
This is some text that contains photo.
<span class='title'> photobyile</span>
I would like function to return
This is some text that contains photo.
This is what I tried:
$dom = new domDocument;
$dom->loadHTML($string);
$dom->preserveWhiteSpace = false;
$spans = $dom->getElementsByTagName('span');
foreach($spans as $span)
{
$naslov = $span->nodeValue;
echo $naslov;
$string = preg_replace("/$naslov/", " ", $string);
}
I'm aware that $span->nodeValue
returns value of span tag and not whole tag, but I don't know how to get whole tag, together with class name.
Thanks, Ile
Definition and Usage The strip_tags() function strips a string from HTML, XML, and PHP tags. Note: HTML comments are always stripped. This cannot be changed with the allow parameter. Note: This function is binary-safe.
Try removing the spans directly from the DOM tree.
$dom = new DOMDocument();
$dom->loadHTML($string);
$dom->preserveWhiteSpace = false;
$elements = $dom->getElementsByTagName('span');
while($span = $elements->item(0)) {
$span->parentNode->removeChild($span);
}
echo $dom->saveHTML();
@ile - I've had that problem - it's because the index of the foreach iterator happily keeps incrementing, while calling removeChild() on the DOM also seems to remove the nodes from the DomNodeList ($spans). So for every span you remove, the nodelist shrinks one element and then gets its foreach counter incremented by one. Net result: it skips one span.
I'm sure there is a more elegant way, but this is how I did it - I moved the references from the DomNodeList to a second array, where they would not be removed by the removeChild() operation.
foreach($spans as $span) {
$nodes[] = $span;
}
foreach($nodes as $span) {
$span->parentNode->removeChild($span);
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With