I have the HTML source of a page as a string:
<html>
<head>
<link rel="stylesheet" type="text/css" href="/css/all.css" />
</head>
<body>
<a href="/test.aspx">Test</a>
<a href="http://mysite.com">Test</a>
<img src="/images/test.jpg"/>
<img src="http://mysite.com/images/test.jpg"/>
</body>
</html>
I want to convert all the relative paths to absolute ones. The desired output is:
<html>
<head>
<link rel="stylesheet" type="text/css" href="http://mysite.com/css/all.css" />
</head>
<body>
<a href="http://mysite.com/test.aspx">Test</a>
<a href="http://mysite.com">Test</a>
<img src="http://mysite.com/images/test.jpg"/>
<img src="http://mysite.com/images/test.jpg"/>
</body>
</html>
Note: only the relative paths in that string should be converted to absolute ones. The URLs that are already absolute should not be touched. Can this be done with a regex or by other means?
Don't try to parse HTML with regex, as explained here https://stackoverflow.com/a/1732454/932418 and https://stackoverflow.com/a/1758162/932418
Use an HTML parser like HtmlAgilityPack instead:
using System;
using System.IO;
using HtmlAgilityPack;

string html =
@"<html>
<head>
<link rel=""stylesheet"" type=""text/css"" href=""/css/all.css"" />
</head>
<body>
<a href=""/test.aspx"">Test</a>
<a href=""http://example.com"">Test</a>
<img src=""/images/test.jpg""/>
<img src=""http://example.com/images/test.jpg""/>
</body>
</html>";

var baseUri = new Uri("http://example.com");
var doc = new HtmlDocument();
doc.LoadHtml(html);

// Each element to rewrite, paired with the attribute that holds its URL.
var targets = new[] { ("link", "href"), ("a", "href"), ("img", "src") };
foreach (var (tag, attr) in targets)
{
    foreach (var node in doc.DocumentNode.Descendants(tag))
    {
        string value = node.GetAttributeValue(attr, "");
        // Skip elements without the attribute.
        if (string.IsNullOrEmpty(value)) continue;
        // Leave URLs that are already absolute exactly as they are.
        if (Uri.TryCreate(value, UriKind.RelativeOrAbsolute, out Uri parsed) && parsed.IsAbsoluteUri) continue;
        node.SetAttributeValue(attr, new Uri(baseUri, value).AbsoluteUri);
    }
}

// Serialize the modified document back to a string.
var writer = new StringWriter();
doc.Save(writer);
string newHtml = writer.ToString();
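The resolution step can also be isolated into a small helper that depends only on System.Uri. The following is a minimal sketch, not part of HtmlAgilityPack: the names UrlHelper and ToAbsolute are hypothetical, and it assumes you know the URL the page was loaded from. Resolving against that page URL rather than the bare site root also handles document-relative paths such as "pic.jpg" or "../css/all.css".

```csharp
using System;

public static class UrlHelper
{
    // Hypothetical helper for illustration; pageUrl is assumed to be the
    // address the HTML was downloaded from.
    public static string ToAbsolute(Uri pageUrl, string value)
    {
        // Nothing to resolve for missing or blank attribute values.
        if (string.IsNullOrWhiteSpace(value)) return value;

        // Leave URLs that already carry a scheme (http:, mailto:, ...) exactly as written.
        if (Uri.TryCreate(value, UriKind.RelativeOrAbsolute, out Uri parsed) && parsed.IsAbsoluteUri)
            return value;

        // Resolve everything else against the page URL; keep the input if it cannot be resolved.
        return Uri.TryCreate(pageUrl, value, out Uri resolved) ? resolved.AbsoluteUri : value;
    }
}
```

With a page URL of http://example.com/blog/post.aspx, "/images/test.jpg" should resolve to http://example.com/images/test.jpg, "pic.jpg" to http://example.com/blog/pic.jpg, and "http://example.com" should come back unchanged.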
Add
<base href="http://mysite.com/images/" />
to the head of the page.
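Note that the path in the base URL matters for some links and not for others. The sketch below uses System.Uri as a stand-in for the browser, on the assumption that both follow the same standard resolution rules for simple cases like these. BaseHrefDemo and Resolve are hypothetical names used only for illustration.

```csharp
using System;

public static class BaseHrefDemo
{
    // Resolves a reference roughly the way a <base href="..."> would,
    // using System.Uri in place of the browser (an assumption of this sketch).
    public static string Resolve(string baseHref, string reference)
    {
        return new Uri(new Uri(baseHref), reference).AbsoluteUri;
    }
}
```

With the base http://mysite.com/images/, a root-relative reference such as "/css/all.css" still resolves to http://mysite.com/css/all.css, but a document-relative one such as "test.jpg" resolves to http://mysite.com/images/test.jpg. If the page also contains document-relative links that do not live under /images/, a base of http://mysite.com/ (or the page's own URL) is probably the safer choice.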