
HTMLAgilityPack get innerText of a td tag with an id attribute

I am trying to select the inner text of a td that has an id attribute, using the HtmlAgilityPack.

Html Code:

<td id="header1">    5    </td>
<td id="header2">    8:39pm    </td>
<td id="header3">    8:58pm    </td>
...

Code:

HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();

doc.LoadHtml(data);

var nodes = doc.DocumentNode.SelectNodes("//td[@id='header1']");

if (nodes != null)
{
    foreach (HtmlAgilityPack.HtmlNode node in nodes)
    {
        MessageBox.Show(node.InnerText);
    }
}

I keep getting null for nodes, presumably because I am not selecting the td tag correctly, but I cannot figure out what I have done wrong.

Edit:

I made a mistake with header1 and header2, but there are 5 different td tags with ids header1 to header5.

cheeseman asked Mar 16 '13 11:03



2 Answers

You are trying to select header1 but the id is header2.

You could also use GetElementbyId directly (note the lowercase "b" in the method name):

var td = doc.GetElementbyId("header2");
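
As a rough sketch of how that call might be used in practice: GetElementbyId returns null when no element has the requested id, so it is worth checking before reading InnerText, and the cells in the question are padded with spaces, so Trim() helps. The class name HeaderLookup and the helper GetCellText are made up for illustration, and the sample markup is copied from the question.

```csharp
using System;
using HtmlAgilityPack;

static class HeaderLookup
{
    // Hypothetical helper: returns the trimmed text of the element with the
    // given id, or null if the document has no such element.
    public static string GetCellText(string html, string id)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // GetElementbyId returns null when no element carries that id.
        HtmlNode td = doc.GetElementbyId(id);
        return td == null ? null : td.InnerText.Trim();
    }

    static void Main()
    {
        // Sample markup taken from the question.
        string html = "<td id=\"header1\">    5    </td>" +
                      "<td id=\"header2\">    8:39pm    </td>";

        string text = GetCellText(html, "header2");
        Console.WriteLine(text ?? "(no element with that id)");
    }
}
```

If this prints the "no element" message for your real page, the id is probably not present in the HTML you actually downloaded, which would also explain the null from SelectNodes.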
Tim Schmelter answered Nov 14 '22 01:11



I don't think you're doing anything wrong: your code should give you only the <td> with id="header1". If the ids run from, say, header1 to header5, you can loop over them:

for (int i = 1; i <= 5; i++)
{
    var tdNode = doc.DocumentNode.SelectSingleNode(string.Format("//td[@id='header{0}']", i));

    // SelectSingleNode returns null when no <td> has this id
    if (tdNode == null) continue;

    // do something with the node here
}

That said, I suggest posting your entire code, so that we can tell you why you're getting null and also show a better way of parsing the <td> nodes without the loop above (e.g. something like //tr[@id='some-id']//td[contains(@id, 'header')]).
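
For what it's worth, here is a minimal sketch of the single-query idea. It drops the //tr[@id='some-id'] part, because the question doesn't show the enclosing row, and keeps only the contains(@id, 'header') predicate. The class HeaderCells and the method ReadHeaderCells are names I made up; adjust the XPath once you know the real structure of your page.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

static class HeaderCells
{
    // Hypothetical helper: maps each matching td's id to its trimmed text.
    public static Dictionary<string, string> ReadHeaderCells(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var result = new Dictionary<string, string>();

        // One query for every td whose id contains "header".
        HtmlNodeCollection nodes =
            doc.DocumentNode.SelectNodes("//td[contains(@id, 'header')]");

        // SelectNodes returns null (not an empty collection) when nothing matches.
        if (nodes == null) return result;

        foreach (HtmlNode node in nodes)
        {
            result[node.GetAttributeValue("id", "")] = node.InnerText.Trim();
        }
        return result;
    }

    static void Main()
    {
        // Sample markup taken from the question.
        string html = "<td id=\"header1\">    5    </td>" +
                      "<td id=\"header2\">    8:39pm    </td>" +
                      "<td id=\"header3\">    8:58pm    </td>";

        foreach (var pair in ReadHeaderCells(html))
        {
            Console.WriteLine(pair.Key + " = " + pair.Value);
        }
    }
}
```

Note that contains() would also match ids such as "subheader"; starts-with(@id, 'header') is a stricter alternative if that matters for your page.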

Oscar Mederos answered Nov 14 '22 02:11
