How to remove a tag and its contents using a regular expression?

Tags: regex, php

$str = 'some text <MY_TAG>tag <em>contents </em></MY_TAG> more text ';

My questions are:

1. How do I retrieve the content (tag <em>contents </em>) that sits between <MY_TAG> and </MY_TAG>?
2. How do I remove <MY_TAG> and its contents from $str?

I am using PHP.

Thank you.

Asked by user187580 on Mar 04 '10 at 18:03



1 Answer

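I tested this function; it also works when other tags are nested inside the one being removed. The third argument controls how the tag list is used: FALSE (the default) removes every tag except the ones you list, and TRUE removes only the ones you list. Found here: https://www.php.net/manual/en/function.strip-tags.php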

<?php
// Removes tags together with their contents.
// $tags:   tag list such as '<b><div>'.
// $invert: FALSE strips every tag except those in $tags;
//          TRUE strips only the tags in $tags.
function strip_tags_content($text, $tags = '', $invert = FALSE) {

  // Pull the bare tag names out of the $tags string.
  preg_match_all('/<(.+?)[\s]*\/?[\s]*>/si', trim($tags), $tags);
  $tags = array_unique($tags[1]);

  if(is_array($tags) && count($tags) > 0) {
    if($invert == FALSE) {
      // Strip every tag pair whose name is NOT in the list.
      return preg_replace('@<(?!(?:'. implode('|', $tags) .')\b)(\w+)\b.*?>.*?</\1>@si', '', $text);
    }
    else {
      // Strip only the listed tag pairs.
      return preg_replace('@<('. implode('|', $tags) .')\b.*?>.*?</\1>@si', '', $text);
    }
  }
  elseif($invert == FALSE) {
    // No list given: strip every tag pair.
    return preg_replace('@<(\w+)\b.*?>.*?</\1>@si', '', $text);
  }
  return $text;
}

// Sample text:
$text = '<b>sample</b> text with <div>tags</div>';

// Result for:
echo strip_tags_content($text);
// text with

// Result for:
echo strip_tags_content($text, '<b>');
// <b>sample</b> text with

// Result for:
echo strip_tags_content($text, '<b>', TRUE);
// text with <div>tags</div>
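
For the exact case in the question, the same lazy-match idea can be narrowed to a single tag name. The following is only a minimal sketch, not part of the php.net function: it assumes the question's string wraps `tag <em>contents </em>` in `<MY_TAG>` ... `</MY_TAG>` (the markup in `$str` is an assumption), and the names `$pattern`, `$inner` and `$stripped` are made up for illustration.

```php
<?php
// Assumed input: the question's $str with the <MY_TAG> markup in place.
$str = 'some text <MY_TAG>tag <em>contents </em></MY_TAG> more text ';

// Opening tag (with optional attributes), lazily captured body, closing tag.
// The s flag lets . span newlines; the i flag ignores the case of the tag name.
$pattern = '@<MY_TAG\b[^>]*>(.*?)</MY_TAG>@si';

// 1. Retrieve what sits between <MY_TAG> and </MY_TAG> (null if absent).
$inner = preg_match($pattern, $str, $m) === 1 ? $m[1] : null;

// 2. Remove the tag together with its contents.
$stripped = preg_replace($pattern, '', $str);

var_dump($inner);
var_dump($stripped);
```

With the assumed input, `$inner` is `tag <em>contents </em>` and `$stripped` is `some text  more text ` (the spaces on either side of the removed tag both remain). Because the match is lazy, it stops at the first closing tag, so a `<MY_TAG>` nested inside another `<MY_TAG>` would not be handled; a DOM parser is the safer choice for that kind of input.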
Answered by proseosoc on Oct 03 '22 at 19:10