
Why is // needed instead of /?

Tags: xpath, scrapy

Consider the following HTML code:

<html>
<head>      
  <title>Example website</title>
</head>
 <body>    
  <div>
  <table id='tableid'>
   <tr>
    <td>
        <a href="/blabla" title="Blabla1">Blabla1</a>
        <a href="/blabla" title="Blabla1">Blabla2</a>
        <a href="/blabla" title="Blabla1">Blabla3</a>
        <a href="/blabla" title="Blabla1">Blabla4</a>
    </td>
        <td>col2</td>
        <td>col3</td>
        <td>col4</td>
   </tr>
   </table>
  </div>
 </body>
</html>

If I want to get all the links, why do I have to use:

//table[@id="tableid"]//a/@href

instead of using a single / after the table? I'm already on the table node at that point (it should become my 'root'), so / should be enough...

Thanks in advance!

Ignacio Verona asked Mar 22 '23 12:03



1 Answer

A single / after table[@id="tableid"] would work only if the a elements were immediate children of table. Here they are nested inside tr and td, so /a selects nothing. To get any descendant a of table[@id="tableid"], you need //a.
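The difference can be checked with a small sketch. It assumes Python's standard-library xml.etree.ElementTree, which supports only a subset of XPath (no /@href step, so the attribute is read with .get("href") instead). The markup is a trimmed, well-formed version of the question's HTML, and the distinct href values are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed version of the question's markup.
# The distinct href values are hypothetical, chosen so the output is easy to read.
HTML = """
<html>
 <body>
  <div>
   <table id="tableid">
    <tr>
     <td>
      <a href="/blabla1">Blabla1</a>
      <a href="/blabla2">Blabla2</a>
     </td>
     <td>col2</td>
    </tr>
   </table>
  </div>
 </body>
</html>
""".strip()

root = ET.fromstring(HTML)

# Single /: only <a> elements that are direct children of the table.
child_links = root.findall(".//table[@id='tableid']/a")

# Double //: <a> elements at any depth below the table.
descendant_links = root.findall(".//table[@id='tableid']//a")

child_hrefs = [a.get("href") for a in child_links]
descendant_hrefs = [a.get("href") for a in descendant_links]

print(child_hrefs)       # []
print(descendant_hrefs)  # ['/blabla1', '/blabla2']
```

In Scrapy or lxml, which implement full XPath 1.0, the same contrast should hold with /a/@href versus //a/@href.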

// is short for /descendant-or-self::node()/

The descendant-or-self axis contains the context node and the descendants of the context node. Since you're establishing the context node as table[@id="tableid"], you won't get any a elements other than those that are descendants of table[@id="tableid"].
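As a rough illustration of that expansion, the sketch below spells out /descendant-or-self::node()/a by hand and compares it with the // shorthand. It again assumes the standard-library xml.etree.ElementTree rather than Scrapy's selectors; the document and its href values are hypothetical, and Element.iter() stands in for the descendant-or-self axis (it visits elements only, which is enough here because text nodes have no children):

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one link outside the table, two inside it.
doc = ET.fromstring(
    "<div>"
    "<a href='/outside'>outside</a>"
    "<table id='tableid'><tr><td>"
    "<a href='/inside1'>one</a>"
    "<a href='/inside2'>two</a>"
    "</td></tr></table>"
    "</div>"
)

# Establish the context node.
table = doc.find(".//table[@id='tableid']")

# Expanded form: visit the context node and every descendant element
# (descendant-or-self), then take the <a> children of each visited node (/a).
hrefs_expanded = [
    a.get("href")
    for node in table.iter()
    for a in node.findall("a")
]

# Shorthand form: // relative to the same context node.
hrefs_shorthand = [a.get("href") for a in table.findall(".//a")]

print(hrefs_expanded)   # ['/inside1', '/inside2']
print(hrefs_shorthand)  # ['/inside1', '/inside2']
```

Both lists match, and the link outside the table never appears, because the search starts from the table as the context node.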

kjhughes answered Apr 26 '23 13:04
