
Match HTML tags in C# regex matcher

Tags:

c#

regex

I am trying to match the contents between each </strong> and the next <strong> in the following:

"\n<strong>Address:</strong> <br>\nHennessy Park Hotel, 65, Ebene Cybercity, Ebene \n<br>\<strong>Phone:</strong> (230) 403 7200<br>\n\n<strong>Fax:</strong> (230) 403 7201<br>\n<strong>Contact:</strong> <a href="/contact.html">Send E-mail</a><br>\n\n<p>\n"

Is there a way to do that? Currently I am using:

Match result = Regex.Match(box.InnerHtml, @"<\/strong>(.*?)<strong>", RegexOptions.ECMAScript);

But I am not getting the contents that I wanted.

user2327579 asked Feb 03 '26 13:02


2 Answers

This will allow you to select the contents including the strong tags across multiple lines.

<strong>(.|\n)*?<\/strong>

This will allow you to select the content between the strong tags:

<strong>([\s\S]*?)<\/strong>

This will allow you to select the content between the end strong tag and the start strong tag:

<\/strong>([\s\S]*?)<strong>
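
As a rough sketch of how the last pattern might be used from C#, the snippet below runs it through Regex.Matches so that every occurrence is returned, not just the first. The class name StrongTagDemo, the helper BetweenTags, and the shortened HTML string are made up for illustration and are not from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class StrongTagDemo
{
    // Hypothetical helper: returns the trimmed text found between each
    // </strong> and the next <strong>.
    public static List<string> BetweenTags(string html)
    {
        var results = new List<string>();
        // [\s\S] matches any character, including newlines,
        // so no RegexOptions are needed for multi-line content.
        foreach (Match m in Regex.Matches(html, @"</strong>([\s\S]*?)<strong>"))
        {
            results.Add(m.Groups[1].Value.Trim());
        }
        return results;
    }

    public static void Main()
    {
        // Shortened stand-in for the HTML in the question (assumed input).
        string html =
            "\n<strong>Address:</strong> <br>\nHennessy Park Hotel, 65, Ebene Cybercity, Ebene \n<br>\n" +
            "<strong>Phone:</strong> (230) 403 7200<br>\n\n" +
            "<strong>Fax:</strong> (230) 403 7201<br>\n";

        foreach (string value in BetweenTags(html))
        {
            Console.WriteLine("[" + value + "]");
        }
    }
}
```

Note that the text after the final </strong> is not returned, because the pattern requires a following <strong>. The captured values also still contain the <br> tags, which would need to be stripped separately.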

I hope this helps!

jdaval answered Feb 05 '26 02:02


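To do it in C#, change your call to use RegexOptions.Singleline, which makes . match every character, including \n (by default it matches everything except \n). RegexOptions is an enum marked with [FlagsAttribute], so in general you can combine multiple flags with the bitwise-or operator (|). RegexOptions.ECMAScript is the exception: according to the .NET documentation it can be combined only with IgnoreCase, Multiline, and Compiled, and pairing it with any other option, Singleline included, throws an ArgumentOutOfRangeException. So replace ECMAScript with Singleline instead of adding to it.

Your new call would look like this (scroll to the end):

Match result = Regex.Match(box.InnerHtml, @"<\/strong>(.*?)<strong>", RegexOptions.Singleline);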

See this article for a full description of using the flags: https://msdn.microsoft.com/en-us/library/yd1hzczs%28v=vs.110%29.aspx
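
To make the effect of the flag concrete, here is a minimal sketch that runs the question's pattern with and without RegexOptions.Singleline (used on its own here). The class SinglelineDemo, the helper FirstBetween, and the sample string are hypothetical names and data chosen for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglelineDemo
{
    // Hypothetical helper: runs the question's pattern with the given options
    // and returns the first Match (check .Success before reading groups).
    public static Match FirstBetween(string html, RegexOptions options)
    {
        return Regex.Match(html, @"<\/strong>(.*?)<strong>", options);
    }

    public static void Main()
    {
        // Assumed sample input with real newlines between the tags.
        string html =
            "<strong>Address:</strong> <br>\nHennessy Park Hotel\n<br>\n" +
            "<strong>Phone:</strong> (230) 403 7200<br>";

        // Without Singleline, '.' does not match '\n', so the pattern
        // cannot span the line breaks between the two tags.
        Console.WriteLine(FirstBetween(html, RegexOptions.None).Success);

        // With Singleline, '.' also matches '\n' and the capture succeeds.
        Match result = FirstBetween(html, RegexOptions.Singleline);
        if (result.Success)
        {
            Console.WriteLine(result.Groups[1].Value.Trim());
        }
    }
}
```

Regex.Match returns only the first occurrence. Regex.Matches with the same pattern and options would return all of them.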

Pluto answered Feb 05 '26 03:02