I want to dynamically remove specific tags and their content from an HTML file and thought of using preg_replace, but I can't get the syntax right. Basically it should, for example, do something like: replace everything between (and including) "" by nothing.
Could anybody help me out on this, please?
Easy dude.
To make a regex ungreedy, use the U modifier. To let the dot match newlines as well, so that a match can span several lines, use the s modifier (the actual "multiline" modifier is m, which only changes how ^ and $ behave). Knowing that, to remove all paragraphs use this pattern:
#<p[^>]*>(.*)?</p>#sU
Explanation:
<p[^>]*> : matches an opening paragraph tag, with optional attributes (a style, for example)
(.*)? : everything in between (in ungreedy mode, thanks to U)
</p> : obviously, the closing paragraph tag
Hope that helps!
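To show how that pattern plugs into preg_replace, here is a minimal sketch. The helper name removeTagWithContent and the sample HTML are made up for illustration. It also adds two things the pattern above does not have: a \b after the tag name, so that "p" does not also match <pre> or <param>, and the i modifier for case-insensitive tag names.

```php
<?php
// Hypothetical helper: removes every <tag ...>...</tag> block, content included.
function removeTagWithContent(string $html, string $tag): string
{
    // Escape the tag name so it is safe to embed in the pattern.
    $name = preg_quote($tag, '#');

    // \b stops "p" from matching <pre> or <param>.
    // s: the dot also matches newlines; U: quantifiers are ungreedy;
    // i: tag names are matched case-insensitively.
    $pattern = '#<' . $name . '\b[^>]*>.*</' . $name . '\s*>#sUi';

    $result = preg_replace($pattern, '', $html);

    // preg_replace returns null on error; fall back to the input in that case.
    return $result ?? $html;
}

$html = "<div><p class=\"intro\">First\nparagraph</p><pre>keep me</pre><p>Second</p></div>";
echo removeTagWithContent($html, 'p'), "\n";
// prints: <div><pre>keep me</pre></div>
```

Keep in mind that a regex like this does not understand nesting: if the same tag is nested inside itself, the match stops at the first closing tag it finds.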
If you are trying to sanitize your data, it is often recommended that you use a whitelist of allowed tags as opposed to blacklisting certain terms and tags. A whitelist is easier to get right and does a better job of preventing XSS attacks, because anything you did not think of is rejected by default. There's a well-known library called HTML Purifier that, although large and somewhat slow, gives very good results when purifying your data.
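A rough sketch of the whitelist approach with HTML Purifier could look like the following. It assumes the library was installed through Composer (hence the vendor/autoload.php path), and the list of allowed tags and the sample input are just examples to adapt.

```php
<?php
// Assumption: HTML Purifier is installed via Composer (package ezyang/htmlpurifier).
require_once 'vendor/autoload.php';

$config = HTMLPurifier_Config::createDefault();

// Whitelist: only these tags (and the href attribute on links) survive.
// This particular list is an example, not a recommendation.
$config->set('HTML.Allowed', 'p,b,i,a[href]');

$purifier = new HTMLPurifier($config);

// Made-up input containing an inline event handler and a script block.
$dirty = '<p onclick="steal()">Hello <b>world</b></p><script>alert(1)</script>';
$clean = $purifier->purify($dirty);

echo $clean, "\n";
```

Anything that is not on the whitelist, such as the onclick attribute and the script element in this input, should be dropped from the output rather than escaped.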