Remove useless paragraph tags from string

Tags: regex, php

If I have a string like:

<p>&nbsp;</p>
<p></p>
<p class="a"><br /></p>
<p class="b">&nbsp;</p>
<p>blah blah blah this is some real content</p>
<p>&nbsp;</p>
<p></p>
<p class="a"><br /></p>

How can I turn it into just:

<p>blah blah blah this is some real content</p>

The regex needs to pick up &nbsp;s and spaces.

user90501 asked Nov 30 '22 12:11


1 Answer

$result = preg_replace('#<p[^>]*>(\s|&nbsp;?|<br\s*/?>)*</p>#i', '', $input);

This doesn't catch literal non-breaking space characters (U+00A0, as opposed to the &nbsp; entity) in the markup, but those are very rare to see.
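If your input might contain literal U+00A0 characters, or paragraphs holding only a <br /> as in the question's sample, a variant along these lines may help. This is a minimal sketch, not a definitive implementation: the function name remove_empty_paragraphs is hypothetical, and the pattern assumes UTF-8 input and paragraphs that contain nothing but whitespace, &nbsp;, U+00A0, or <br> tags.

```php
<?php
// Hypothetical helper (not part of the original answer).
// Removes <p> elements whose content is only whitespace, &nbsp; entities,
// literal U+00A0 characters, or <br> tags. Assumes UTF-8 input.
function remove_empty_paragraphs(string $html): string
{
    // (?:\s[^>]*)?  optional attributes, so <pre> or <param> are not matched
    // \x{00A0}      the literal non-breaking space character
    // <br\s*/?>     <br>, <br/> or <br />
    // trailing \s*  also drops the line break left behind by a removed tag
    // i = case-insensitive, u = treat pattern and subject as UTF-8
    $pattern = '#<p(?:\s[^>]*)?>(?:\s|&nbsp;|\x{00A0}|<br\s*/?>)*</p>\s*#iu';

    $result = preg_replace($pattern, '', $html);

    // preg_replace() returns null on failure (for example invalid UTF-8);
    // fall back to the untouched input in that case.
    return $result ?? $html;
}
```

Calling remove_empty_paragraphs($input) on the sample string from the question should leave only the paragraph with real content, possibly followed by a line break that trim() can remove.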

Since you're dealing with HTML, if this is user input I'd suggest using HTML Purifier, which will also deal with XSS vulnerabilities. The configuration setting there that removes empty p tags is %AutoFormat.RemoveEmpty.
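As a rough sketch of what that configuration might look like, assuming HTML Purifier is installed through Composer (the autoload path and the wrapper function name purify_user_html are assumptions, not something from the answer):

```php
<?php
// Assumes HTML Purifier was installed with Composer (ezyang/htmlpurifier);
// adjust the path if the library is loaded another way.
require_once __DIR__ . '/vendor/autoload.php';

// Hypothetical wrapper name, used here for illustration only.
function purify_user_html(string $html): string
{
    $config = HTMLPurifier_Config::createDefault();

    // The setting named in the answer: drop elements with no content.
    $config->set('AutoFormat.RemoveEmpty', true);

    // Companion setting so that elements holding only non-breaking
    // spaces are also treated as empty.
    $config->set('AutoFormat.RemoveEmpty.RemoveNbsp', true);

    $purifier = new HTMLPurifier($config);

    return $purifier->purify($html);
}
```

Whether this also removes the <br />-only paragraphs from the question is worth checking against your own input before relying on it.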

Edward Z. Yang answered Dec 10 '22 21:12