I am crawling the web for HTML, and when I run PHP's strip_tags on it, the whole document is squashed into one line and all structure is lost.
I would like to preserve some structure by replacing closing heading (h1 to h6) and p tags, as well as br tags, with newlines.
Would preg_replace be the best solution for this?
Once those tags are replaced I would run strip_tags, so that the result keeps a basic structure.
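$str = 'some html';
// Closing block tags and void tags that should become line breaks.
$tags = array('</p>','<br />','<br/>','<br>','<hr />','<hr/>','<hr>','</h1>','</h2>','</h3>','</h4>','</h5>','</h6>');
// str_ireplace also matches uppercase variants such as </P> or <BR>.
$str = str_ireplace($tags, "\n", $str);
// then strip the remaining tags
$str = strip_tags($str);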
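For comparison, here is a minimal sketch of the preg_replace variant I am asking about. The function name html_to_text_with_breaks is made up for illustration, and the pattern only covers the tags listed above; it is not meant as a full HTML parser. The idea is that one case-insensitive regex can also catch tags with attributes or extra whitespace (for example <br class="x"> or </p >), which a fixed list of strings would miss.

```php
<?php
// Hypothetical helper (name is an assumption, not from any library):
// turn selected tags into newlines, then strip everything else.
function html_to_text_with_breaks(string $html): string
{
    // Closing </p> and </h1>..</h6>, plus <br> and <hr> in any form
    // (<br>, <br/>, <br />, <BR class="x">), matched case-insensitively.
    $pattern = '~</(?:p|h[1-6])\s*>|<(?:br|hr)\b[^>]*>~i';
    $text = preg_replace($pattern, "\n", $html);

    // Remove all remaining tags.
    $text = strip_tags($text);

    // Collapse runs of three or more newlines into a single blank line.
    $text = preg_replace("~\n{3,}~", "\n\n", $text);

    return trim($text);
}
```

Calling html_to_text_with_breaks('<h1>Title</h1><p>One<br>two</p>') should give three lines: Title, One, two.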