I am new to regex, but I decided it was the easiest route to what I needed to do. I have a string (in PHP) that contains a large amount of HTML, and I want to remove any tags that have style=display:none.
For example:
<img src="" style="display:none" />
<img src="" style="width:11px;display: none" >
etc...
So far my Regex is:
<img.*style=.*display.*:.*none;.* >
But when I use it in PHP with preg_replace, it seems to leave bits of HTML behind and also takes the next element away.
Like Michael pointed out, you don't want to use Regex for this purpose. A Regex does not know what an element tag is. <foo> is as meaningful as >foo< unless you teach it the difference. Teaching the difference is incredibly tedious though.
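To make the problem concrete, here is a minimal sketch of what goes wrong with the pattern from the question. The sample string is made up for illustration (it is not from the question), and it is written so that the original pattern matches at all, i.e. with a trailing semicolon after "none" and a space before the closing ">". Each `.*` is greedy, so the match runs from the first `<img` to the last ` >` on the line:

```php
<?php
// Hypothetical input: a hidden image, a paragraph, and a visible image on one line.
$html = '<img src="a.png" style="display:none;" ><p>keep me</p><img src="b.png" >';

// The pattern from the question. Every .* is greedy and happily crosses tag boundaries.
$pattern = '/<img.*style=.*display.*:.*none;.* >/';

$result = preg_replace($pattern, '', $html);

// The paragraph and the visible image are removed along with the hidden one.
var_dump($result); // string(0) ""
```

Making the quantifiers lazy or replacing `.` with `[^>]` helps a little, but the pattern still knows nothing about attributes or quoting, which is why a parser is the safer tool.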
DOM is so much more convenient:
$html = <<< HTML
<img src="" style="display:none" />
<IMG src="" style="width:11px;display: none" >
<img src="" style="width:11px" >
HTML;
The above is our (invalid) markup. We feed it to DOM like this:
$dom = new DOMDocument();
$dom->loadHTML($html);
$dom->normalizeDocument();
Now we query the DOM for all "IMG" elements that have a "style" attribute containing the text "display". We could query for "display: none" in the XPath directly, but our input markup also has occurrences with no space in between:
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//img[contains(@style, "display")]') as $node) {
    $style = str_replace(' ', '', $node->getAttribute('style'));
    if (strpos($style, 'display:none') !== FALSE) {
        $node->parentNode->removeChild($node);
    }
}
We iterate over the IMG nodes and remove all whitespace from their style attribute content. Then we check if it contains "display:none" and if so, remove the element from the DOM.
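Since the question asks about any tag and not only images, the loop above could be wrapped in a small helper along these lines. This is only a sketch: the function name `removeHiddenElements` is made up, and lowercasing the style value and stripping all whitespace (not just spaces) are my own assumptions about what inputs you might see.

```php
<?php
// Hypothetical helper, not part of the original answer.
function removeHiddenElements(string $html): string
{
    // Collect libxml warnings about sloppy markup instead of printing them.
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);

    // '//*[@style]' selects every element that has a style attribute.
    // iterator_to_array() takes a snapshot before we start removing nodes.
    $nodes = iterator_to_array($xpath->query('//*[@style]'));

    foreach ($nodes as $node) {
        // Drop all whitespace and lowercase, so "DISPLAY : none" is caught too.
        $style = strtolower(preg_replace('/\s+/', '', $node->getAttribute('style')));
        if (strpos($style, 'display:none') !== false && $node->parentNode !== null) {
            $node->parentNode->removeChild($node);
        }
    }

    return $dom->saveHTML();
}

// Usage with made-up markup:
echo removeHiddenElements('<p style="DISPLAY : none">hidden-text</p><p>visible-text</p>');
```

The check is still a plain substring test on the attribute value, so it does not understand CSS. It is just far less fragile than matching against the whole document.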
Now we only need to save our HTML:
echo $dom->saveHTML();
gives us:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><img src="" style="width:11px"></body></html>
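As the output shows, DOM wraps the fragment in a doctype plus html and body elements. If you only want the fragment back, one option is to serialize the children of the body one by one. A minimal sketch, assuming PHP 5.3.6 or later (where saveHTML() accepts a node argument) and using made-up sample markup:

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<img src="" style="width:11px"><p>text</p>');

// libxml adds <html> and <body> around the fragment, so grab the body element.
$body = $dom->getElementsByTagName('body')->item(0);

$fragment = '';
if ($body !== null) {
    // Serialize each child of <body> on its own, leaving out the wrapper elements.
    foreach ($body->childNodes as $child) {
        $fragment .= $dom->saveHTML($child);
    }
}

echo $fragment;
```

This keeps the doctype and the html and body tags out of the string you write back.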
Screw Regex!
Addendum: you might also be interested in Parsing XML documents with CSS selectors
$html = preg_replace("/<img[^>]+style[^>]+none[^>]+>/", '', $html);