I'm trying to extract part of a string.
I used this expression:
@"<a .*href=""(?<Url>(.*))(?="")"""
Example data to match:
var input = @"<html lang=""en"">
<head>
<link href=""http://www.somepage.com/c/main.css"" rel=""stylesheet"" type=""text/css"" />
<link rel=""canonical"" href=""http://www.somepage.com"" />
<script src=""http://www.somepage.com/professional/bower_components/modernizr/modernizr.js"" type=""text/javascript""></script>
</head>
<body>
<header>
<div>
<div>
<a aria-haspopup=""true"" href=""http://www.somepage.com/someotherpage""><img src=""http://www.somepage.com/i/sprite/logo.png"" alt=page"" /></a>
</div>
</div>
</header>
</body>
</html>";
For now I was able to get this value:
http://www.somepage.com/someotherpage\"><img src=""http://www.somepage.com/i/sprite/logo.png"" alt=page"" /></a>
with this code:
var regexPattern = new Regex(PATTERN, RegexOptions.IgnoreCase);
var matches = regexPattern.Matches(httpResult);
foreach (Match match in matches)
{
// here I'm getting the value shown above
var extractedValue = match.Groups["Url"].Value; // its value is http://www.somepage.com/someotherpage\"><img src=""http://www.somepage.com/i/sprite/logo.png"" alt=page"" /></a>
}
What I want to get under match.Groups["Url"].Value is simply http://www.somepage.com/someotherpage, without anything after the href attribute value.
Is it possible to get only that part of match without using Substring on extractedValue?
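You were almost there. The (.*) in your group is greedy, so it runs on to the last quote on the line. One minor change to the regex, not allowing quotes inside the group, stops it at the end of the href value:
@"<a .*href=""(?<Url>([^""]*))(?="")"""
// (.*) became ([^""]*). The quote is doubled because the pattern is a verbatim string; the regex itself sees [^"]*.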
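To make the change concrete, here is a minimal sketch of the corrected pattern wrapped in a helper. The class name AnchorHrefExtractor and the method ExtractUrls are hypothetical names chosen for illustration; only the pattern and the Regex calls come from the question and answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original code.
public static class AnchorHrefExtractor
{
    // Same pattern as in the question, except the Url group is [^""]*
    // (regex: [^"]*), so it cannot run past the quote that closes href.
    private const string Pattern = @"<a .*href=""(?<Url>([^""]*))(?="")""";

    // Returns the href value of every <a ...> tag the pattern matches.
    public static List<string> ExtractUrls(string html)
    {
        var urls = new List<string>();
        var regex = new Regex(Pattern, RegexOptions.IgnoreCase);
        foreach (Match match in regex.Matches(html))
        {
            urls.Add(match.Groups["Url"].Value);
        }
        return urls;
    }
}
```

Calling AnchorHrefExtractor.ExtractUrls(input) on the example data should give a single entry, http://www.somepage.com/someotherpage, with no Substring needed. The link tags are skipped because the pattern requires the literal text "<a " at the start.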