
How to remove redundant <br /> tags from HTML code using PHP?

I'm parsing some messy HTML with PHP that contains redundant <br /> tags,
and I would like to clean them up a bit. For instance:

<br>

<br /><br /> 


<br>

How would I replace something like that with the following, using preg_replace()?

<br /><br />

Newlines, spaces, and the differences between <br>, <br/>, and <br /> would all have to be accounted for.

Edit: Basically I'd like to replace every instance of three or more successive breaks with just two.

delaccount992 asked Dec 10 '22 08:12


2 Answers

Here is something you can use. The first line finds every run of two or more <br> tags (of any variant, with whitespace between them) and replaces it with a well-formed <br /><br />.

I also included the second line to clean up the rest of the <br> tags if you want that too.

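function clean($txt)
{
    // Collapse any run of two or more <br>, <br/> or <br /> tags
    // (optionally separated by whitespace) into exactly two.
    $txt = preg_replace('~(?:<br\s*/?>\s*){2,}~i', '<br /><br />', $txt);
    // Normalize every break tag to <br /> and drop the whitespace after it.
    $txt = preg_replace('~<br\s*/?>\s*~i', '<br />', $txt);
    return $txt;
}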
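
To see what the two passes do to input like the sample in the question, here is a minimal sketch. The helper name collapse_breaks() and the test strings are made up for illustration, and it assumes PHP 7 or later for the type declarations:

```php
<?php
// Hypothetical helper name; it mirrors the two-pass approach described above.
function collapse_breaks(string $html): string
{
    // Pass 1: any run of two or more break tags becomes exactly two.
    $html = preg_replace('~(?:<br\s*/?>\s*){2,}~i', '<br /><br />', $html);
    // Pass 2: rewrite every break tag as <br />, dropping trailing whitespace.
    return preg_replace('~<br\s*/?>\s*~i', '<br />', $html);
}

// Input modelled on the question: mixed tag styles, spaces and newlines.
$messy = "<br>\n\n<br /><br /> \n\n\n<br>\n";

echo collapse_breaks($messy), PHP_EOL;       // prints <br /><br />
echo collapse_breaks('one<BR>two'), PHP_EOL; // prints one<br />two
```

Note that whitespace following a break tag is swallowed by the pattern, so text after the breaks ends up directly next to the last <br />.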

H9kDroid answered Dec 28 '22 07:12


This should work, using a minimum quantifier ({3,}) so that only runs of three or more breaks match:

$multibreaks = preg_replace('/(?:<br\s*\/?>\s*){3,}/i', '<br /><br />', $multibreaks);

Should match appalling <br><br /><br/><br> constructions too.
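
A quick way to check the "three or more" behaviour is to run the pattern against a run of three breaks and a run of two. This is only a sketch: limit_breaks() is a hypothetical wrapper name, and the i flag (to catch uppercase <BR>) is an assumption added here, not something the question asked for:

```php
<?php
// Hypothetical wrapper: collapse only runs of three or more break tags.
function limit_breaks(string $html): string
{
    // {3,} requires at least three consecutive tags; shorter runs never match.
    return preg_replace('/(?:<br\s*\/?>\s*){3,}/i', '<br /><br />', $html);
}

echo limit_breaks("a<br>\n<br/>\n<BR />b"), PHP_EOL; // prints a<br /><br />b
echo limit_breaks('a<br><br>b'), PHP_EOL;            // prints a<br><br>b
```

Runs of one or two breaks are left exactly as they were, so they keep their original tag style.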

Karl Andrew answered Dec 28 '22 07:12
