Strip Tags and everything in between

Tags:

php

How can I strip <h1>including this content</h1>?

I know you can use strip_tags() to remove the tags, but I want everything in between gone as well.

Any help would be appreciated.

Andy asked Apr 13 '10 14:04



2 Answers

As you’re dealing with HTML, you should use an HTML parser to process it correctly. You can use PHP’s DOMDocument and query the elements with DOMXPath, e.g.:

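$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
// Select every <h1> element in the document and detach it, contents included.
foreach ($xpath->query('//h1') as $node) {
    $node->parentNode->removeChild($node);
}
// Serialize the modified document back to an HTML string.
$html = $doc->saveHTML();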
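
When the input is a fragment rather than a full page, loadHTML() adds a doctype and html/body elements around it, and saveHTML() returns all of that. The following is a minimal sketch of one way to get just the fragment back, not a definitive implementation. The function name removeElements and the wrapper <div> are assumptions for illustration and are not part of the answer above. It also assumes $tagName is a trusted, simple element name other than div, and that the input is ASCII or otherwise safe under loadHTML()'s default encoding handling.

```php
<?php
// Hypothetical helper: removes every <$tagName> element, contents included,
// from an HTML fragment and returns the remaining fragment.
function removeElements(string $html, string $tagName): string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of emitting PHP warnings
    // for sloppy markup; restore the previous setting afterwards.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment so it has a single, known container element.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The XPath result is a static list, so removing nodes while iterating is safe.
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//' . $tagName) as $node) {
        $node->parentNode->removeChild($node);
    }

    // Serialize only the wrapper's children, skipping the doctype and the
    // html/body elements that loadHTML() added.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $result = '';
    foreach ($wrapper->childNodes as $child) {
        $result .= $doc->saveHTML($child);
    }
    return $result;
}

echo removeElements('Hello<h1>including this content</h1> There !!', 'h1');
```

Because the HTML parser lowercases element names, passing 'h1' should also catch <H1> in the input.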
Gumbo answered Oct 02 '22 20:10


Try this:

preg_replace('/<h1[^>]*>([\s\S]*?)<\/h1[^>]*>/', '', '<h1>including this content</h1>');

Example:

echo preg_replace('/<h1[^>]*>([\s\S]*?)<\/h1[^>]*>/', '', 'Hello<h1>including this content</h1> There !!');

Output:

Hello There !!
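
The pattern above is case-sensitive, so it would miss <H1>. The following sketch shows one possible variant that adds the i and s modifiers and takes the tag name as a parameter. The function name stripTagWithContent is hypothetical. Like any regex approach, it assumes reasonably simple markup: nested elements of the same tag, or tag-like text inside comments or attribute values, can produce wrong results.

```php
<?php
// Hypothetical helper: removes <$tagName ...>...</$tagName> spans with a regex.
function stripTagWithContent(string $html, string $tagName): string
{
    // Escape the tag name so it is matched literally inside the pattern.
    $tag = preg_quote($tagName, '~');

    // i: match the tag case-insensitively; s: let "." span newlines.
    // \b keeps "h1" from matching a longer name that merely starts with "h1";
    // the lazy .*? stops at the first closing tag.
    $pattern = '~<' . $tag . '\b[^>]*>.*?</' . $tag . '\s*>~is';

    // preg_replace() returns null on error; fall back to the input unchanged.
    return preg_replace($pattern, '', $html) ?? $html;
}

echo stripTagWithContent('Hello<H1 class="title">including this content</h1> There !!', 'h1');
```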
Sarfraz answered Oct 02 '22 21:10