Remove content between HTML tags in PHP?

Question

I would like to remove all content (the text between tags) from an HTML string, keeping only the tags. Is there an elegant way to do this without writing a complex regex?

Put differently, I am looking for the opposite of what strip_tags() does.

Suggestions?

Anders · Accepted Answer

This solution uses regex. I will let you decide if it is complex or not.

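$out = preg_replace("/(?<=^|>)[^<]+/", "", $in);

Let's break it down:

(?<=^|>): A lookbehind. It is not part of the match, but it still has to be there. Matches either the beginning of the string (^) or a literal >.
[^<]+: Matches one or more characters that are not <, newlines included. The match therefore runs up to the next < or to the end of the string.

A lazy .*? followed by a lookahead such as (?=<|$) looks equivalent, but it can also match the empty string directly in front of a tag. After an empty match, preg_replace retries at the same position and requires a non-empty match, and the lazy quantifier then expands across the tag itself. For example, <p>Hello</p> would come out as </p>. The negated character class avoids this because it can never consume a <.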

The match is replaced by nothing (""), so that everything between > and < is deleted. A working demo can be seen here. Whitespace and newlines between tags are deleted as well, so you end up with one very long line.
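
As a rough illustration of the replacement described above, here is a minimal sketch. The helper name and the sample input are made up for this example. It uses a negated character class, [^<]+, for the text run, so that a match can never extend across a tag.

```php
<?php
// Hypothetical helper name, used here only for illustration.
function remove_text_between_tags(string $in): string
{
    // (?<=^|>) : position right after the start of the string or a '>'
    // [^<]+    : one or more characters up to the next '<' or the end of the string
    return preg_replace('/(?<=^|>)[^<]+/', '', $in);
}

// Made-up sample input with text before, between and after the tags.
$in = "Intro <p>Hello <b>World</b>!</p>\n<br/> outro";
echo remove_text_between_tags($in), "\n";  // prints: <p><b></b></p><br/>
```

The newline between </p> and <br/> is removed along with the text, which is why the output collapses to a single line.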

EDIT: If you know that your input will always be wrapped in HTML tags, you can make it even simpler for yourself, since you no longer need to handle text at the very beginning or end of the string:

$out = preg_replace("/>.*?</s", "><", $in);

This variant will not work for input with text at the beginning or the end. For instance, Hello <b>World</b>! will become Hello <b></b>!, because the text before the first tag and after the last tag is left untouched.
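
To make the limitation concrete, the following sketch runs the simplified pattern on a few inputs. The sample strings are made up for illustration and only the pattern comes from the EDIT above.

```php
<?php
// Simplified pattern from the EDIT; it assumes the input starts and ends with a tag.
$simple = '/>.*?</s';

// Input wrapped in tags: all text is removed.
echo preg_replace($simple, '><', '<p>Hello <b>World</b>!</p>'), "\n";   // <p><b></b></p>

// Text outside the outermost tags is not touched.
echo preg_replace($simple, '><', 'Hello <b>World</b>!'), "\n";          // Hello <b></b>!

// Thanks to the s modifier, newlines and indentation between tags are removed too.
echo preg_replace($simple, '><', "<ul>\n  <li>One</li>\n</ul>"), "\n";  // <ul><li></li></ul>
```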

Tags: html, dom, php

gaekaete