
Relative to absolute paths in HTML

I need to create a newsletter from a URL. To do that, I:

  1. Create a WebClient.
  2. Use WebClient's DownloadData method to get the page source as a byte array.
  3. Convert the byte array to a string and set it as the newsletter content.

However, I have trouble with the paths. All of the elements' sources are relative (/img/welcome.png), but I need absolute ones, like http://www.example.com/img/welcome.png.

How can I do this?

Alex M asked Apr 27 '10 07:04


1 Answer

One way to solve this is to use the HtmlAgilityPack library.

An example that fixes the links (a elements with an href attribute):

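// Requires: using System.Net; using System.Text; using HtmlAgilityPack;

// Download the page and decode it (this assumes the page is UTF-8 encoded).
WebClient client = new WebClient();
byte[] requestHTML = client.DownloadData(sourceUrl);
string sourceHTML = Encoding.UTF8.GetString(requestHTML);

HtmlDocument htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(sourceHTML);

// SelectNodes returns null when nothing matches, so check before iterating.
HtmlNodeCollection links = htmlDoc.DocumentNode.SelectNodes("//a[@href]");
if (links != null)
{
    foreach (HtmlNode link in links)
    {
        HtmlAttribute att = link.Attributes["href"];
        if (!string.IsNullOrEmpty(att.Value))
        {
            // AbsoluteUrlByRelative is a helper that is not shown in this answer.
            att.Value = this.AbsoluteUrlByRelative(att.Value);
        }
    }
}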
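
The AbsoluteUrlByRelative helper is not shown above. A minimal sketch of what it might look like, using System.Uri to resolve a relative reference against the page address, is below. The UrlHelper class name and the extra baseUrl parameter are assumptions made for illustration (the original call takes only the attribute value), so treat this as one possible implementation rather than the code actually used.

```csharp
using System;

public static class UrlHelper
{
    // Hypothetical stand-in for the AbsoluteUrlByRelative helper used above.
    // baseUrl is assumed to be the address the page was downloaded from.
    public static string AbsoluteUrlByRelative(string baseUrl, string url)
    {
        Uri baseUri = new Uri(baseUrl, UriKind.Absolute);

        // TryCreate resolves a relative reference against the base address;
        // a url that is already absolute is parsed on its own.
        Uri resolved;
        if (Uri.TryCreate(baseUri, url, out resolved))
        {
            return resolved.AbsoluteUri;
        }

        // Leave anything that cannot be parsed unchanged.
        return url;
    }
}
```

With this sketch, UrlHelper.AbsoluteUrlByRelative("http://www.example.com/news/page.html", "/img/welcome.png") gives http://www.example.com/img/welcome.png. Since the question is about image sources, the same loop could presumably be repeated with the XPath "//img[@src]" and the "src" attribute.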
Alex M answered Nov 15 '22 09:11