
How to remove html special chars?

I am creating an RSS feed file for my application, and I want to remove HTML tags from the content, which strip_tags does. But strip_tags does not remove HTML special character codes (entities) such as:

&nbsp; &amp; &copy;

etc.

Is there a function I can use to remove these special character codes from my string?

djmzfKnm asked Mar 18 '09 10:03



2 Answers

Either decode them using html_entity_decode or remove them using preg_replace:

$Content = preg_replace("/&#?[a-z0-9]+;/i","",$Content);  

(From here)

EDIT: An alternative, following Jacco's comment:

It might be nice to replace the '+' with {2,8} or something similar. This limits the chance of removing entire sentences when an unencoded '&' is present.

$Content = preg_replace("/&#?[a-z0-9]{2,8};/i","",$Content);  
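As a rough illustration of the removal approach, the sketch below runs strip_tags first and then deletes whatever entity references remain. The helper name strip_entities and the sample string are made up for this example and are not part of the original answer.

```php
<?php
// Hypothetical helper (name is an assumption, not from the answer above).
function strip_entities(string $text): string
{
    // Deletes named (&amp;), decimal (&#169;) and hex (&#xA9;) references
    // whose body is 2 to 8 alphanumeric characters long.
    return preg_replace('/&#?[a-z0-9]{2,8};/i', '', $text);
}

// Sample input, invented for illustration.
$raw = '<p>Tom &amp; Jerry&nbsp;&copy; 2009</p>';

// strip_tags removes the <p> tags but leaves the entities untouched;
// strip_entities then removes the entities themselves.
$clean = strip_entities(strip_tags($raw));

echo $clean, PHP_EOL;
```

Note that the entities are deleted rather than replaced, so the surrounding spaces are left behind ("Tom" and "Jerry" end up separated by two spaces here). Collapsing whitespace afterwards may be worth considering.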
schnaader answered Oct 11 '22 05:10


Use html_entity_decode to convert HTML entities back to their characters.

You'll need to set the charset (the encoding argument, for example 'UTF-8') for it to work correctly.
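A minimal sketch of the decoding approach, assuming the content is UTF-8 and PHP 7 or later. The sample string and variable names are invented for illustration. The final htmlspecialchars call is an extra step not mentioned in the answer: decoded text can contain a bare & or <, which presumably needs escaping again before it goes into the feed's XML.

```php
<?php
// Sample input, invented for illustration.
$raw = '<p>Tom &amp; Jerry&nbsp;&copy; 2009</p>';

// Remove the tags, then turn entities into real characters.
// The third argument is the charset the answer refers to.
$text = html_entity_decode(strip_tags($raw), ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Assumption: the text is then embedded in RSS (XML), so the XML-special
// characters are escaped again. Only &, <, >, " and ' are affected.
$xmlSafe = htmlspecialchars($text, ENT_QUOTES | ENT_XML1, 'UTF-8');

echo $xmlSafe, PHP_EOL;
```

With this route &nbsp; becomes a real non-breaking space (U+00A0) and &copy; becomes ©, so nothing is lost; only the markup is gone.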

andi answered Oct 11 '22 05:10