How do I get the value inside a <td> tag with xpath/htmlwebunit

Tags:

xpath

jwebunit

I am trying to create a Java application that retrieves information from a webpage. Below is part of the page's HTML; I am trying to access the value in the 1st td tag in the 2nd tr tag:

<TABLE  CLASS="datadisplaytable" width = "100%">
<TR>
    <TD CLASS="dddead">&nbsp;</TD>
    <TH CLASS="ddheader" scope="col" ><SPAN class="fieldlabeltext">Capacity</SPAN></TH>
    <TH CLASS="ddheader" scope="col" ><SPAN class="fieldlabeltext">Actual</SPAN></TH>
    <TH CLASS="ddheader" scope="col" ><SPAN class="fieldlabeltext">Remaining</SPAN></TH>
</TR> 
<TR>
    <TH CLASS="ddlabel" scope="row" ><SPAN class="fieldlabeltext">Seats</SPAN></TH>
    <TD CLASS="dddefault">46</TD> <!-- this is the value I want -->
    <TD CLASS="dddefault">46</TD>
    <TD CLASS="dddefault">0</TD>
</TR>

This is what I have right now, but it only returns the class of the td tag and not the value inside it:

List<?> table = page.getByXPath("//table[@class='datadisplaytable'][1]//tr[2]/td");

How would I go about getting the value of the td tag and not its properties?

Edit: the code above returns this:

HtmlTableDataCell[<td class="dddefault">]

asked Feb 28 '12 18:02

KrispyDonuts

2 Answers

> I am trying to create a Java Application that retrieves information from a webpage. This is part of the code I am trying to access the value in the 1st td tag in the 2nd tr tag:

Assuming that the document is as shown in the question (TABLE is the top element), use:

/TABLE/TR[2]/TD[1]/text()

This selects any text-node child of the first TD child of the second TR child of the top element TABLE.
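As a rough check of the expression above, here is a minimal sketch that evaluates it with the JDK's built-in javax.xml.xpath API rather than HtmlUnit. It assumes a well-formed copy of the fragment: the closing </TABLE> is added and &nbsp; is written as &#160; so that an XML parser accepts it. The class name TdTextDemo and the helper evaluate are made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class TdTextDemo {
    // Well-formed copy of the question's fragment (assumption: </TABLE> added,
    // &nbsp; replaced by the numeric character reference &#160;).
    static final String TABLE_XML =
          "<TABLE CLASS=\"datadisplaytable\" width=\"100%\">"
        + "<TR>"
        + "<TD CLASS=\"dddead\">&#160;</TD>"
        + "<TH CLASS=\"ddheader\" scope=\"col\">Capacity</TH>"
        + "<TH CLASS=\"ddheader\" scope=\"col\">Actual</TH>"
        + "<TH CLASS=\"ddheader\" scope=\"col\">Remaining</TH>"
        + "</TR>"
        + "<TR>"
        + "<TH CLASS=\"ddlabel\" scope=\"row\">Seats</TH>"
        + "<TD CLASS=\"dddefault\">46</TD>"
        + "<TD CLASS=\"dddefault\">46</TD>"
        + "<TD CLASS=\"dddefault\">0</TD>"
        + "</TR>"
        + "</TABLE>";

    // Hypothetical helper: parses the XML and returns the string value of the
    // first node selected by the expression (empty string if nothing matches).
    static String evaluate(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Text of the first TD child of the second TR child of the top TABLE.
        System.out.println(evaluate(TABLE_XML, "/TABLE/TR[2]/TD[1]/text()")); // prints 46
    }
}
```

Note that XML element names are case-sensitive, so the upper-case names in the expression match only because the sample document uses upper-case tags.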

In case the table is buried in the XML document, but can be uniquely identified by its CLASS attribute, use:

//TABLE[@CLASS='datadisplaytable']/TR[2]/TD[1]/text()

This selects any text-node child of the first TD child of the second TR child of any TABLE element in the XML document (we know there is only one such element) whose CLASS attribute has the string value 'datadisplaytable'.

Finally, in the worst case, where there could be many TABLE elements whose CLASS attribute's value is 'datadisplaytable' and we want to select from the first such table, use:

(//TABLE[@CLASS='datadisplaytable'])[1]/TR[2]/TD[1]/text()
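The parentheses matter here. Without them, the [1] predicate applies to each child step separately, so //TABLE[@CLASS='datadisplaytable'][1] selects the first matching TABLE child of every parent, which can be more than one node. The sketch below illustrates this with the JDK's javax.xml.xpath API on a made-up document in which two such tables sit in separate DIV elements. The document, the class name FirstTableDemo, and the helper count are assumptions for illustration only.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FirstTableDemo {
    // Hypothetical document: two tables with the same CLASS, each in its own DIV.
    static final String PAGE_XML =
          "<HTML><BODY>"
        + "<DIV><TABLE CLASS=\"datadisplaytable\"><TR><TD>first</TD></TR></TABLE></DIV>"
        + "<DIV><TABLE CLASS=\"datadisplaytable\"><TR><TD>second</TD></TR></TABLE></DIV>"
        + "</BODY></HTML>";

    // Hypothetical helper: returns how many nodes the expression selects.
    static int count(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // [1] is applied per parent: each DIV contributes its first matching TABLE.
        System.out.println(count(PAGE_XML, "//TABLE[@CLASS='datadisplaytable'][1]"));   // prints 2
        // [1] is applied to the whole parenthesized node-set, in document order.
        System.out.println(count(PAGE_XML, "(//TABLE[@CLASS='datadisplaytable'])[1]")); // prints 1
    }
}
```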

answered Dec 31 '22 19:12

Dimitre Novatchev

To get the text content of an element, XPath provides the node test "text()", which selects the element's text-node children.

Element containing text 't' exactly         //*[.='t']
Element <E> containing text 't'             //E[contains(text(),'t')]
<a> containing text 't'                     //a[contains(text(),'t')]
<a> with target link 'url'                  //a[@href='url']
Link URL labeled with text 't' exactly      //a[.='t']/@href
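The patterns above mix text() and '.', and the two are not interchangeable: text() selects only the text nodes that are direct children of an element, while '.' (or string(...)) uses the element's whole string value, including text inside nested elements. The following minimal sketch shows the difference with the JDK's javax.xml.xpath API on a made-up row in which the TH wraps its label in a SPAN, as in the question's table. The row, the class name TextNodeDemo, and the helper evaluate are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class TextNodeDemo {
    // Hypothetical row: the TH has no direct text, only a SPAN that holds it.
    static final String ROW_XML =
        "<TR><TH><SPAN>Seats</SPAN></TH><TD>46</TD></TR>";

    // Hypothetical helper: string value of the first selected node, or of the
    // expression result itself when it is not a node-set.
    static String evaluate(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // TD has a direct text-node child, so text() finds it.
        System.out.println("[" + evaluate(ROW_XML, "/TR/TD/text()") + "]");  // prints [46]
        // TH has no direct text-node child, so text() selects nothing.
        System.out.println("[" + evaluate(ROW_XML, "/TR/TH/text()") + "]");  // prints []
        // The string value of TH includes the text of the nested SPAN.
        System.out.println("[" + evaluate(ROW_XML, "string(/TR/TH)") + "]"); // prints [Seats]
    }
}
```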

If you are also using JWebUnit, the method "getElementTextByXPath" in net.sourceforge.jwebunit.junit.WebTestCase can also be used to get the text:

public String getElementTextByXPath(String xpath)
    Deprecated. Get text of the given element.
    Parameters: xpath - xpath of the element.

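    // Inside a test class that extends WebTestCase: for positions 1 to 5, get the
    // text of a td that is the i-th td child of its parent and has a text node.
    for (int i = 1; i < 6; i++) {
        String result = getElementTextByXPath("//td[" + i + "][text()]");
        System.out.println("The content of TD is " + result);
    }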

answered Dec 31 '22 19:12

user1307037
