
How to select following sibling/XML tag using XPath

Tags: xml, xpath, lxml

I have an HTML file (from Newegg) whose markup is organized as shown below. In the specifications table, each value is in a td with class 'desc' and each label is in a td with class 'name'. Below are two examples of data from Newegg pages.

<tr>
    <td class="name">Brand</td>
    <td class="desc">Intel</td>
</tr>
<tr>
    <td class="name">Series</td>
    <td class="desc">Core i5</td>
</tr>
<tr>
    <td class="name">Cores</td>
    <td class="desc">4</td>
</tr>
<tr>
    <td class="name">Socket</td>
    <td class="desc">LGA 1156</td>
</tr>

<tr>
    <td class="name">Brand</td>
    <td class="desc">AMD</td>
</tr>
<tr>
    <td class="name">Series</td>
    <td class="desc">Phenom II X4</td>
</tr>
<tr>
    <td class="name">Cores</td>
    <td class="desc">4</td>
</tr>
<tr>
    <td class="name">Socket</td>
    <td class="desc">Socket AM3</td>
</tr>

In the end I would like to populate a CPU class (which is already set up) with a Brand, Series, Cores, and Socket field, one for each value. This is the only way I can think of to do it:

if(parsedDocument.xpath(tr/td[@class="name"])=='Brand'):
    CPU.brand = parsedDocument.xpath(tr/td[@class="name"]/nextsibling?).text

I would then do the same for the rest of the values. How would I accomplish the nextsibling and is there an easier way of doing this?

asked Jun 29 '10 09:06 by Corey Farwell



3 Answers

How would I accomplish the nextsibling and is there an easier way of doing this?

You may use:

tr/td[@class='name']/following-sibling::td

but I would rather select the value directly:

tr[td[@class='name'] ='Brand']/td[@class='desc']

This assumes that:

  1. The context node against which the XPath expression is evaluated is the parent of all tr elements (not shown in your question).

  2. Each tr element has only one td with class attribute valued 'name' and only one td with class attribute valued 'desc'.
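As a rough illustration of the direct expression with lxml, here is a minimal sketch. It assumes the rows from the question sit inside a single parent element (a `<table>` wrapper added here for the example) and that the fragment is well-formed enough to parse as XML; a real Newegg page would more likely go through `lxml.html`. The helper name `spec_value` is hypothetical.

```python
from lxml import etree

# Sample rows from the question, wrapped in a <table> (an assumption) so
# that the parent of all <tr> elements can act as the context node.
ROWS = """
<table>
  <tr><td class="name">Brand</td><td class="desc">Intel</td></tr>
  <tr><td class="name">Series</td><td class="desc">Core i5</td></tr>
  <tr><td class="name">Cores</td><td class="desc">4</td></tr>
  <tr><td class="name">Socket</td><td class="desc">LGA 1156</td></tr>
</table>
"""

table = etree.fromstring(ROWS)


def spec_value(context, label):
    """Return the 'desc' text of the row whose 'name' cell equals `label`."""
    # $label is an XPath variable; lxml binds it from the keyword argument.
    cells = context.xpath(
        "tr[td[@class='name'] = $label]/td[@class='desc']", label=label
    )
    return cells[0].text if cells else None


brand = spec_value(table, "Brand")
socket = spec_value(table, "Socket")
```

The returned strings could then be assigned to the fields of the existing CPU class, for example `CPU.brand = spec_value(table, "Brand")`.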

answered Oct 16 '22 20:10 by Dimitre Novatchev


Try the following-sibling axis (following-sibling::td).
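One possible way to apply this axis with lxml is to loop over every 'name' cell and read the td that follows it, collecting the pairs into a dictionary. This is only a sketch: the `<table>` wrapper and the `specs` dictionary are assumptions made for the example, and the fragment is parsed as XML for simplicity.

```python
from lxml import etree

# Second sample from the question, wrapped in a <table> (an assumption).
ROWS = """
<table>
  <tr><td class="name">Brand</td><td class="desc">AMD</td></tr>
  <tr><td class="name">Series</td><td class="desc">Phenom II X4</td></tr>
  <tr><td class="name">Cores</td><td class="desc">4</td></tr>
  <tr><td class="name">Socket</td><td class="desc">Socket AM3</td></tr>
</table>
"""

table = etree.fromstring(ROWS)

specs = {}
for name_cell in table.xpath("tr/td[@class='name']"):
    # following-sibling::td[1] selects the nearest td after the name cell.
    desc_cells = name_cell.xpath("following-sibling::td[1]")
    if desc_cells:
        specs[name_cell.text] = desc_cells[0].text
```

With the pairs in a dictionary, each field of the CPU class can be filled from one lookup (for example `specs["Brand"]`) instead of one XPath query per field.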

answered Oct 16 '22 18:10 by Philipp


For completeness, adding to the accepted answer above: if you are interested in any sibling regardless of the element type, you can use this variation:

following-sibling::*
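To show the difference between the two forms, here is a small sketch with lxml. The row below is hypothetical (it is not from the Newegg markup) and mixes th and td elements so that the two expressions return different results.

```python
from lxml import etree

# Hypothetical row mixing element types after the 'name' cell.
row = etree.fromstring(
    '<tr><td class="name">Brand</td><th class="desc">Intel</th><td>note</td></tr>'
)

name_cell = row.xpath("td[@class='name']")[0]

# '*' matches following sibling elements of any name, in document order.
tags_any = [el.tag for el in name_cell.xpath("following-sibling::*")]

# Naming the element restricts the match to td siblings only.
tags_td = [el.tag for el in name_cell.xpath("following-sibling::td")]
```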

answered Oct 16 '22 18:10 by Milan