Is there a good way to remove HTML from a Java string? A simple regex like
replaceAll("\\<.*?>", "")
will work, but things like &
wont be converted correctly and non-HTML between the two angle brackets will be removed (i.e. the .*?
in the regex will disappear).
Strip_tags() is a function that allows you to strip out all HTML and PHP tags from a given string (parameter one), however you can also use parameter two to specify a list of HTML tags you want.
To remove html tags from string in react js, just use the /(<([^>]+)>)/ig regex with replace() method it will remove tags with their attribute and return new string.
The strip_tags() function strips a string from HTML, XML, and PHP tags. Note: HTML comments are always stripped. This cannot be changed with the allow parameter.
Use a HTML parser instead of regex. This is dead simple with Jsoup.
public static String html2text(String html) { return Jsoup.parse(html).text(); }
Jsoup also supports removing HTML tags against a customizable whitelist, which is very useful if you want to allow only e.g. <b>
, <i>
and <u>
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With