Regex Find Text Between Tags C#

Question

I want to strip the html tags and only return the text between the tags. Here is what I'm currently using.

string regularExpressionPattern1 = @"<td(.*?)<\/td>";
Regex regex = new Regex(regularExpressionPattern1, RegexOptions.Singleline);
MatchCollection collection = regex.Matches(value.ToString());

I currently get <td>13</td>, and I just want 13.

Thanks,

Yevgeniy.Chernobrivets · Accepted Answer

You need to get the value of the capture group, not of the whole match. Try this:

Match m = collection[0];
var stripped = m.Groups[1].Value;
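To show how this fits together, here is a minimal sketch that reads group 1 for every match instead of only the first. Note that with the pattern from the question, `<td(.*?)<\/td>`, group 1 would also include the `>` that closes the opening tag (for example `>13`). The sketch therefore assumes a slightly adjusted pattern, `<td\b[^>]*>(.*?)</td>`, which is not from the original post. The class and method names (`TdCaptureExample`, `ExtractCells`) are hypothetical and only there for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TdCaptureExample
{
    // Assumed pattern (adjusted from the question): [^>]* skips any attributes,
    // so the '>' that ends the opening tag stays outside group 1.
    private static readonly Regex CellPattern =
        new Regex(@"<td\b[^>]*>(.*?)</td>", RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Returns the text captured between every <td ...> and </td> pair in the input.
    public static List<string> ExtractCells(string html)
    {
        var cells = new List<string>();
        foreach (Match m in CellPattern.Matches(html))
        {
            // Groups[0] is the whole match; Groups[1] is the first parenthesised group.
            cells.Add(m.Groups[1].Value);
        }
        return cells;
    }
}
```

Calling `TdCaptureExample.ExtractCells("<tr><td>13</td><td class=\"n\">42</td></tr>")` should give a list containing `13` and `42`. This is still a regex over markup, so nested tables or malformed HTML can produce surprising results.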

Mike Webb · Answer

You can use look-behind ?<= and look-ahead ?= like this:

(?<=<td>)(.*?)(?=<\/td>)

That should give you just the text between the tags. More info on Regex and look-ahead/look-behind can be found Here.

Also, a good Regex tester can be found Here. I use it to test all my Regex strings when I'm writing them.
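The look-around pattern above can be sketched in C# as follows. Because the look-behind and look-ahead only assert that the tags are present and do not consume them, `Match.Value` is already the inner text and no group lookup is needed. The class and method names (`TdLookaroundExample`, `ExtractCells`) are hypothetical. This sketch also assumes the opening tag is exactly `<td>` with no attributes; a cell such as `<td class="n">` would not match this pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TdLookaroundExample
{
    // (?<=<td>) requires "<td>" immediately before the match,
    // (?=</td>) requires "</td>" immediately after it; neither is part of the match.
    private static readonly Regex InnerText =
        new Regex(@"(?<=<td>).*?(?=</td>)", RegexOptions.Singleline);

    // Returns the text found between each plain <td> and the next </td>.
    public static List<string> ExtractCells(string html)
    {
        var cells = new List<string>();
        foreach (Match m in InnerText.Matches(html))
        {
            // The whole match is the cell text, since the tags were only asserted.
            cells.Add(m.Value);
        }
        return cells;
    }
}
```

For the input `<td>13</td><td>42</td>` this should return `13` and `42`. Compared with the capture-group approach, the result is the same; the look-around form mainly saves the `Groups[1]` step.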

jessehouwing · Answer

So, using the HTML Agility Pack, this would be really easy...

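 // LoadHtml returns void, so create the document first and then load the markup into it.
 HtmlDocument doc = new HtmlDocument();
 doc.LoadHtml(value);
 var nodes = doc.DocumentNode.SelectNodes("//td//text()");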

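This puts the text nodes in the nodes variable. Note that SelectNodes returns null, not an empty collection, when nothing matches, so check for null before iterating.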

Tags:

c#

regex

Trey Balut
