
HTMLAgilityPack - You need to set UseIdAttribute property to true to enable this feature

I am trying to use HtmlAgilityPack with VS2008/.NET 3.5. I get this error even when I set OptionUseIdAttribute to true, although it is supposed to be true by default.

Error Message:
 You need to set UseIdAttribute property to true to enable this feature

Stack Trace:
    at HtmlAgilityPack.HtmlDocument.GetElementbyId(String id)

I tried versions 1.4.6 and 1.4.0; neither worked.

Version 1.4.6 - Net20/HtmlAgilityPack.dll

Version 1.4.0 - Net20/HtmlAgilityPack.dll

This is the code:

    HtmlWeb web = new HtmlWeb();
    HtmlDocument doc = web.Load(url);
    HtmlNode table = doc.GetElementbyId("tblThreads");

This didn't work either:

    HtmlWeb web = new HtmlWeb();
    HtmlDocument doc = new HtmlDocument { OptionUseIdAttribute = true };
    doc = web.Load(url);
    HtmlNode table = doc.GetElementbyId("tblThreads");

How can I fix this issue? Thanks.

asked Oct 18 '13 17:10 by user471317


1 Answer

First I used ILSpy on the 1.4.0 HAP DLL. I navigated to the HtmlDocument class and could see that the GetElementbyId method looks like this:

    // HtmlAgilityPack.HtmlDocument
    /// <summary>
    /// Gets the HTML node with the specified 'id' attribute value.
    /// </summary>
    /// <param name="id">The attribute id to match. May not be null.</param>
    /// <returns>The HTML node with the matching id or null if not found.</returns>
    public HtmlNode GetElementbyId(string id)
    {
        if (id == null)
        {
            throw new ArgumentNullException("id");
        }
        if (this._nodesid == null)
        {
            throw new Exception(HtmlDocument.HtmlExceptionUseIdAttributeFalse);
        }
        return this._nodesid[id.ToLower()] as HtmlNode;
    }

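I then had ILSpy analyze "_nodesid", because for some reason it is not being set in your case. Both "HtmlDocument.DetectEncoding(TextReader)" and "HtmlDocument.Load(TextReader)" assign a value to "_nodesid". Note also that in your second snippet, "doc = web.Load(url)" replaces the document on which you set OptionUseIdAttribute, so that setting is discarded before GetElementbyId is called.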
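To check that behaviour in isolation, here is a minimal sketch that feeds an in-memory HTML string through "HtmlDocument.Load(TextReader)" and then looks up the id. The names IdLookupSketch and FindById are hypothetical, chosen only for illustration, and the sketch assumes OptionUseIdAttribute is left at its default value.

```csharp
using System.IO;
using HtmlAgilityPack;

public static class IdLookupSketch
{
    // Hypothetical helper: parses the given HTML text and returns the
    // node whose id attribute matches, using GetElementbyId.
    public static HtmlNode FindById(string html, string id)
    {
        var doc = new HtmlDocument();

        // Load(TextReader) is one of the methods that assigns the
        // internal id index which GetElementbyId checks for null.
        using (var reader = new StringReader(html))
        {
            doc.Load(reader);
        }

        return doc.GetElementbyId(id);
    }
}
```

If FindById returns the table for a small hand-written page while the original code still throws, that would point at the download step rather than at the id lookup itself.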

Hence you could try an alternative way of reading the content from the URL, one in which "_nodesid" is sure to be assigned, e.g.

    // Requires: using System.Net; using HtmlAgilityPack;
    var doc = new HtmlDocument();
    var request = (HttpWebRequest)WebRequest.Create(url);
    request.Method = "GET";
    using (var response = (HttpWebResponse)request.GetResponse())
    using (var stream = response.GetResponseStream())
    {
        // Parse the response body directly instead of going through HtmlWeb
        doc.Load(stream);
    }
    var table = doc.GetElementbyId("tblThreads");

This approach ensures that "HtmlDocument.Load(TextReader)" is called, and in the decompiled code I can see that this method assigns "_nodesid". It should therefore work, although I haven't compiled the code I've suggested.
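If GetElementbyId keeps throwing, a possible workaround is to look the element up with XPath instead. This is an assumption on my part rather than something the decompiled code above confirms: as far as I know, SelectSingleNode walks the parsed node tree and does not rely on "_nodesid". The names XPathLookupSketch and FindByIdViaXPath are hypothetical, and the sketch assumes the id contains no single quote.

```csharp
using HtmlAgilityPack;

public static class XPathLookupSketch
{
    // Hypothetical helper: finds the first element with the given id
    // by XPath. Returns null when no element matches.
    // Assumes the id value contains no single quote character.
    public static HtmlNode FindByIdViaXPath(HtmlDocument doc, string id)
    {
        string xpath = "//*[@id='" + id + "']";
        return doc.DocumentNode.SelectSingleNode(xpath);
    }
}
```

With the code above it would be called as XPathLookupSketch.FindByIdViaXPath(doc, "tblThreads"). A null result here would suggest the page content never reached the document at all.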

answered Oct 18 '22 14:10 by Ben Smith