
C# Beginner: Delete ALL between two characters in a string (Regex?)

Tags: c#, regex, windows

I have a string containing HTML code. I want to remove all HTML tags, i.e. every character between < and >, including the brackets themselves.

This is my code snippet:

// WebClient is IDisposable, so release it as soon as the download is done.
using (WebClient wClient = new WebClient())
{
    SourceCode = wClient.DownloadString( txtSourceURL.Text );
}
txtSourceCode.Text = SourceCode;
// TODO: remove everything between "<" and ">" here
txtSourceCodeFormatted.Text = SourceCode;

I hope somebody can help me.

asked Dec 01 '13 14:12 by taito



2 Answers

Try this:

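// Needs "using System.Text.RegularExpressions;". Singleline lets "." match newlines, so tags that span lines are removed too.
txtSourceCodeFormatted.Text = Regex.Replace(SourceCode, "<.*?>", string.Empty, RegexOptions.Singleline);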

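But, as others have mentioned, handle with care: a regular expression does not actually parse HTML, so this can give wrong results on comments, <script> blocks, or attribute values that contain a ">" character.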
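
To make the "handle with care" part concrete, here is a minimal sketch of the same lazy pattern wrapped in a helper. The names TagStripDemo and StripTags are made up for illustration, and passing RegexOptions.Singleline (so that "." also matches newlines) is an assumption about what you want for tags that span several lines.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagStripDemo
{
    // Hypothetical helper: removes everything from a '<' up to the next '>'.
    public static string StripTags(string html)
    {
        // Singleline makes '.' match '\n', so a tag split across lines still matches.
        return Regex.Replace(html, "<.*?>", string.Empty, RegexOptions.Singleline);
    }

    public static void Main()
    {
        // A tag that spans two lines is removed completely.
        Console.WriteLine(StripTags("<p\nclass=\"x\">Hello</p>"));
        // prints: Hello

        // Pitfall: the '>' inside the attribute value ends the match too early,
        // so part of the tag leaks into the result.
        Console.WriteLine(StripTags("<a title=\"a > b\">link</a>"));
        // prints:  b">link   (with a leading space)
    }
}
```

The second call shows why a regex is only good enough for simple, well-behaved markup. For arbitrary pages, a real HTML parser is the safer choice.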

answered Oct 16 '22 21:10 by Aage


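According to Ravi's answer, you can strip the tags and any &nbsp; entities with

string noHTML = Regex.Replace(inputHTML, @"<[^>]+>|&nbsp;", "").Trim();

and then collapse the remaining runs of whitespace into single spaces with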

string noHTMLNormalised = Regex.Replace(noHTML, @"\s{2,}", " ");
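
Put together, the two statements could look roughly like the sketch below. The class HtmlToTextSketch, the method ToPlainText, and the sample input are invented for illustration; only the two Regex.Replace calls come from the lines above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HtmlToTextSketch
{
    // Hypothetical helper that chains the two replacements.
    public static string ToPlainText(string inputHTML)
    {
        // Step 1: drop every tag and every &nbsp; entity, then trim the ends.
        string noHTML = Regex.Replace(inputHTML, @"<[^>]+>|&nbsp;", "").Trim();
        // Step 2: collapse any run of two or more whitespace characters into one space.
        return Regex.Replace(noHTML, @"\s{2,}", " ");
    }

    public static void Main()
    {
        string html = "<h1>Title</h1>\n\n<p>Some   <b>bold</b>&nbsp;text.</p>";
        Console.WriteLine(ToPlainText(html));
        // prints: Title Some boldtext.
    }
}
```

Note that "bold" and "text" end up glued together, because &nbsp; is replaced with an empty string. If that matters for your input, replacing matches with a single space in step 1 is one possible adjustment, since step 2 collapses any extra spaces anyway.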

answered Oct 16 '22 19:10 by Vignesh Kumar A