I have some HTML and the requirement is to remove only starting <p>
tags from the string.
Example:
input: <p style="display:inline; margin: 40pt;"><span style="font:XXXX;"> Text1 Here</span></p><p style="margin: 50pt"><span style="font:XXXX">Text2 Here</span></p> <p style="display:inline; margin: 40pt;"><span style="font:XXXX;"> Text3 Here</span></p>the string goes on like that
desired output: <span style="font:XXXX;"> Text1 Here</span></p><span style="font:XXXX">Text2 Here</span></p><span style="font:XXXX;"> Text3 Here</span></p>
Is it possible using Regex? I have tried some combinations but not working. This is all a single string. Any advice appreciated.
I'm sure you know the warnings about using regex to match html. With these disclaimers, you can do this:
Option 1: Leaving the closing </p>
tags
This first option leaves the closing </p>
tags, but that's what your desired output shows. :) Option 2 will remove them as well.
PHP
$replaced = preg_replace('~<p[^>]*>~', '', $yourstring);
JavaScript
replaced = yourstring.replace(/<p[^>]*>/g, "");
Python
replaced = re.sub("<p[^>]*>", "", yourstring)
<p
matches the beginning of the tag[^>]*
matches any character that is not a closing >
>
closes the matchOption 2: Also removing the closing </p>
tags
PHP
$replaced = preg_replace('~</?p[^>]*>~', '', $yourstring);
JavaScript
replaced = yourstring.replace(/<\/?p[^>]*>/g, "");
Python
replaced = re.sub("</?p[^>]*>", "", yourstring)
This is a PCRE expression:
/<p( *\w+=("[^"]*"|'[^']'|[^ >]))*>(.*<\/p>)/Ug
Replace each occurrence with $3 or just remove all occurrences of:
/<p( *\w+=("[^"]*"|'[^']'|[^ >]))*>/g
If you want to remove the closing tag as well:
/<p( *\w+=("[^"]*"|'[^']'|[^ >]))*>(.*)<\/p>/Ug
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With