Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Remove <p> tags - Regular Expression (Regex)

Tags:

html

regex

I have some HTML and the requirement is to remove only starting <p> tags from the string.

Example:

input: <p style="display:inline; margin: 40pt;"><span style="font:XXXX;"> Text1 Here</span></p><p style="margin: 50pt"><span style="font:XXXX">Text2 Here</span></p> <p style="display:inline; margin: 40pt;"><span style="font:XXXX;"> Text3 Here</span></p>the string goes on like that

desired output: <span style="font:XXXX;"> Text1 Here</span></p><span style="font:XXXX">Text2 Here</span></p><span style="font:XXXX;"> Text3 Here</span></p>

Is it possible using Regex? I have tried some combinations but not working. This is all a single string. Any advice appreciated.

like image 456
n.nasa Avatar asked Jun 19 '14 08:06

n.nasa


2 Answers

I'm sure you know the warnings about using regex to match html. With these disclaimers, you can do this:

Option 1: Leaving the closing </p> tags

This first option leaves the closing </p> tags, but that's what your desired output shows. :) Option 2 will remove them as well.

PHP

$replaced = preg_replace('~<p[^>]*>~', '', $yourstring);

JavaScript

replaced = yourstring.replace(/<p[^>]*>/g, "");

Python

replaced = re.sub("<p[^>]*>", "", yourstring)
  • <p matches the beginning of the tag
  • The negative character class [^>]* matches any character that is not a closing >
  • > closes the match
  • we replace all this with an empty string

Option 2: Also removing the closing </p> tags

PHP

$replaced = preg_replace('~</?p[^>]*>~', '', $yourstring);

JavaScript

replaced = yourstring.replace(/<\/?p[^>]*>/g, "");

Python

replaced = re.sub("</?p[^>]*>", "", yourstring)
like image 193
zx81 Avatar answered Sep 27 '22 23:09

zx81


This is a PCRE expression:

/<p( *\w+=("[^"]*"|'[^']'|[^ >]))*>(.*<\/p>)/Ug

Replace each occurrence with $3 or just remove all occurrences of:

/<p( *\w+=("[^"]*"|'[^']'|[^ >]))*>/g

If you want to remove the closing tag as well:

/<p( *\w+=("[^"]*"|'[^']'|[^ >]))*>(.*)<\/p>/Ug
like image 28
Tobias Avatar answered Sep 27 '22 23:09

Tobias