For example:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <title>title</title>
</head>
<body>
  <a href="aaa.asp?id=1"> I want to get this text </a>
  <div>
    <h1>this is my want!!</h1>
    <b>this is my want!!!</b>
  </div>
</body>
</html>
and the result is:
I want to get this text this is my want!! this is my want!!!
Select the HTML element that needs to be removed, then call JavaScript's remove() method on the element itself, or removeChild() on its parent node, to remove it from the HTML document.
strip_tags() is a PHP function that strips all HTML and PHP tags from a given string (parameter one). The optional second parameter specifies a list of HTML tags you want to keep.
In Java, HTML tags can be removed from a string with the replaceAll() method of the String class, which takes a regular expression and a replacement. Replacing every match of a tag pattern with an empty string returns a new string that contains only the plain text.
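A minimal sketch of the replaceAll() approach, applied to part of the example markup above. The class name HtmlTagStripper, the method name stripTags, and the pattern <[^>]*> are my own choices for illustration, not something from the original answer. Tags are replaced with a space rather than an empty string so that adjacent words do not run together, and the whitespace is collapsed afterwards.

```java
class HtmlTagStripper {

    // Hypothetical helper: removes anything that looks like <...> and
    // normalizes the remaining whitespace.
    static String stripTags(String html) {
        return html
                .replaceAll("<[^>]*>", " ") // replace each tag with a space
                .replaceAll("\\s+", " ")    // collapse runs of whitespace
                .trim();                    // drop leading/trailing spaces
    }

    public static void main(String[] args) {
        String html = "<div> <h1>this is my want!!</h1> <b>this is my want!!!</b> </div>";
        System.out.println(stripTags(html));
    }
}
```

This is only a rough filter: a regular expression does not decode entities such as &amp;amp;, does not drop the contents of script or style elements, and can be confused by a ">" inside an attribute value. For untrusted or messy input, a real HTML parser is the safer choice.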
HTML Agility Pack:
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);
string s = doc.DocumentNode.SelectSingleNode("//body").InnerText;
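Or use this regex-based function (C#):
// requires: using System.Text.RegularExpressions;
public string Strip(string text)
{
    // Lazily match everything from '<' to the next '>', including newlines
    return Regex.Replace(text, @"<(.|\n)*?>", string.Empty);
}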