I have a string containing HTML code, and I want to remove all HTML tags, i.e. every character between < and >.
This is my code snippet:
WebClient wClient = new WebClient();
SourceCode = wClient.DownloadString( txtSourceURL.Text );
txtSourceCode.Text = SourceCode;
//remove here all between "<" and ">"
txtSourceCodeFormatted.Text = SourceCode;
Hope somebody can help me.
Try this:
txtSourceCodeFormatted.Text = Regex.Replace(SourceCode, "<.*?>", string.Empty);
But, as others have mentioned, handle with care.
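One concrete reason for care: by default `.` does not match a newline in .NET regular expressions, so a tag that wraps across lines would be left in place by the pattern above. Here is a minimal sketch of a variant that passes `RegexOptions.Singleline`; the class and method names are hypothetical and only there for illustration.

```csharp
using System.Text.RegularExpressions;

public static class TagStripper
{
    // Hypothetical helper, not part of the original snippet.
    public static string StripTags(string html)
    {
        // Singleline makes '.' match '\n' as well, so tags that
        // span several lines are removed too.
        return Regex.Replace(html, "<.*?>", string.Empty, RegexOptions.Singleline);
    }
}
```

Usage would look like `txtSourceCodeFormatted.Text = TagStripper.StripTags(SourceCode);`. Even so, a regex does not understand HTML: a `>` inside an attribute value ends the match early, and the contents of `<script>` or `<style>` elements remain in the output as text.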
According to Ravi's answer, you can use
string noHTML = Regex.Replace(inputHTML, @"<[^>]+>|&nbsp;", "").Trim();
and then, to collapse runs of whitespace into a single space:
string noHTMLNormalised = Regex.Replace(noHTML, @"\s{2,}", " ");
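The two steps could also be wrapped into one method, roughly like this. It is only a sketch under a few assumptions: the names `HtmlText` and `ToPlainText` are made up, tags are replaced with a space rather than an empty string so that words separated only by tags do not run together, and `WebUtility.HtmlDecode` is used to handle entities in general instead of special-casing one of them.

```csharp
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlText
{
    // Hypothetical helper; the name and the order of steps are assumptions.
    public static string ToPlainText(string inputHTML)
    {
        // 1. Replace every tag with a space.
        string noTags = Regex.Replace(inputHTML, @"<[^>]+>", " ");

        // 2. Decode entities such as &amp; into their characters.
        string decoded = WebUtility.HtmlDecode(noTags);

        // 3. Collapse each run of whitespace into one space and trim the ends.
        return Regex.Replace(decoded, @"\s+", " ").Trim();
    }
}
```

For example, `HtmlText.ToPlainText("<p>Hello <b>world</b> &amp; more</p>")` gives `Hello world & more`. For anything beyond simple, well-formed markup, a real HTML parser is the safer choice.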