
php regex to remove HTML

Tags: html, regex, php

Before we start, strip_tags() doesn't work.

I've got some data that needs to be parsed. The problem is that I need to get rid of all the HTML, which has been formatted very strangely. The tags look like this (notice the spaces):

< p > blah blah blah < / p > < a href= " link.html " > blah blah blah < /a >

None of the regexes I've tried work, and I don't know enough about regex syntax to fix them. I don't care about preserving anything inside the tags, and I would prefer to get rid of the text inside a link as well, if I could.

Anyone have any idea?

(I really need to just sit down and learn regular expressions one day)

Me1000 asked Apr 17 '09 02:04



2 Answers

Does

preg_replace('/<[^>]*>/', '', $content)

work?
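As a rough sketch of how that pattern could be extended to also drop the link text the question mentions: the function name `removeSpacedHtml` and the extra anchor pattern below are illustrative assumptions, not part of the original answer. It assumes attribute values never contain a literal `>`, and it is not a general HTML parser.

```php
<?php
// Hypothetical helper: strips loosely formatted tags such as "< p >" or "< /a >".
function removeSpacedHtml(string $content): string
{
    // 1. Remove whole <a>...</a> elements, including the text between them.
    //    \s* tolerates the stray spaces after "<" and around "/".
    $content = preg_replace('/<\s*a\b[^>]*>.*?<\s*\/\s*a\s*>/is', '', $content);

    // 2. Remove every remaining tag (the pattern from the answer above).
    $content = preg_replace('/<[^>]*>/', '', $content);

    // 3. Collapse the leftover runs of whitespace.
    return trim(preg_replace('/\s+/', ' ', $content));
}

$content = '< p > blah blah blah < / p > < a href= " link.html " > blah blah blah < /a >';
echo removeSpacedHtml($content), "\n";
```

Since `[^>]*` matches anything up to the next `>`, the spaces inside the tags need no special handling in step 2. Only the anchor pattern has to allow for them explicitly.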

chaos answered Sep 20 '22 17:09



strip_tags() will work if you run html_entity_decode() on the string first:

<?php
$text = '< p > blah blah blah < / p > < a href= " link.html " > blah blah blah< /a >';
echo strip_tags(html_entity_decode($text));
?>
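If the data arrives entity-encoded (for example `&lt; p &gt;`), decoding is needed before any tag removal can match. The question reports that strip_tags() does not handle the spaced tags, so the sketch below pairs html_entity_decode() with a regex instead. The function name `stripSpacedTags` and the entity-encoded sample input are assumptions made for illustration.

```php
<?php
// Hypothetical helper: decode entities first, then remove the spaced tags.
function stripSpacedTags(string $html): string
{
    // Turn "&lt;" / "&gt;" back into "<" / ">" so the tag pattern can match.
    $decoded = html_entity_decode($html, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Remove anything between "<" and the next ">", spaces included.
    $noTags = preg_replace('/<[^>]*>/', '', $decoded);

    // Tidy up the whitespace the removed tags leave behind.
    return trim(preg_replace('/\s+/', ' ', $noTags));
}

echo stripSpacedTags('&lt; p &gt; blah blah blah &lt; / p &gt;'), "\n";
```

Decoding before stripping matters for ordering: run the other way round, the encoded tags would pass through the regex untouched and only become visible as tags afterwards.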
Slobodan answered Sep 18 '22 17:09
