String to HtmlDocument

Tags:

html

c#

I'm fetching an HTML document by URL using WebClient.DownloadString(url), but it's then very hard to find the element content that I'm looking for. While reading around I spotted HtmlDocument, which has neat things like GetElementById. How can I populate an HtmlDocument with the HTML downloaded from url?

691

asked Feb 08 '11 16:02

lappy

3 Answers

The HtmlDocument class is a wrapper around the native IHtmlDocument2 COM interface.
You cannot easily create it from a string.

You should use the HTML Agility Pack.

114

answered Oct 20 '22 23:10

SLaks

Using Html Agility Pack as suggested by SLaks, this becomes very easy:

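// Requires the HtmlAgilityPack NuGet package (using HtmlAgilityPack;).
string html = webClient.DownloadString(url);
var doc = new HtmlDocument();
doc.LoadHtml(html);

// Note the lowercase "b": the Agility Pack method is GetElementbyId.
HtmlNode specificNode = doc.GetElementbyId("nodeId");
// SelectNodes returns null when nothing matches the XPath.
HtmlNodeCollection nodesMatchingXPath = doc.DocumentNode.SelectNodes("x/path/nodes");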
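
For anyone who wants something to paste into a console project, here is a minimal sketch of the same idea with null checks added. It assumes the HtmlAgilityPack NuGet package is installed. The HtmlLookup class, the TextById helper and the sample markup are hypothetical names made up for illustration. The string literal stands in for whatever DownloadString(url) returns.

```csharp
using System;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is referenced

static class HtmlLookup
{
    // Hypothetical helper: returns the inner text of the element with the
    // given id, or null when no such element exists in the markup.
    public static string TextById(string html, string id)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // GetElementbyId (lowercase "b") returns null when the id is not found.
        HtmlNode node = doc.GetElementbyId(id);
        return node == null ? null : node.InnerText;
    }
}

class Program
{
    static void Main()
    {
        // Sample markup standing in for the result of DownloadString(url).
        string html = "<html><body><div id=\"nodeId\">Hello</div>"
                    + "<span>one</span><span>two</span></body></html>";

        Console.WriteLine(HtmlLookup.TextById(html, "nodeId"));

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null rather than an empty collection on no match,
        // so guard before iterating.
        HtmlNodeCollection spans = doc.DocumentNode.SelectNodes("//span");
        if (spans != null)
        {
            foreach (HtmlNode span in spans)
            {
                Console.WriteLine(span.InnerText);
            }
        }
    }
}
```

The null checks matter in practice: both GetElementbyId and SelectNodes hand back null when nothing matches, so code that skips them tends to fail with a NullReferenceException on pages that differ from what you expected.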

answered Oct 21 '22 01:10

Dan Tao

To answer the original question:

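// Requires a reference to the Microsoft.mshtml interop assembly (using mshtml;).
HTMLDocument doc = new HTMLDocument();
IHTMLDocument2 doc2 = (IHTMLDocument2)doc;
doc2.write(fileText); // fileText holds the HTML string
doc2.close();         // finish writing before querying the document
// now use doc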

Then to convert back to a string:

string html = doc.documentElement.outerHTML;

answered Oct 20 '22 23:10

David Sherret
