I'm reading Scrapy/XPath tutorials, but this does not seem trivial and I can't find an example that explains it.
Given markup like this, how would you select the <span> element?
<div id="...">
  <div>
  <div>
  <div>
    <div>
    <div>
      <div>
        <div>
          <span>
If we generalize the problem it would be:
In addition to containing text, elements can contain other elements. Elements inside of other elements are called nested elements. Sometimes the outer element will just contain other elements, as in the example above.
Another simple way to select nested elements is the child selector: put a > symbol between the parent and the child, for example div > p or div > p > span. Unlike the descendant selector, which matches nested elements at any depth, a child selector matches only direct children.
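The same child-versus-descendant distinction exists in XPath, where / steps to direct children and // reaches any depth. Here is a minimal sketch using Python's standard-library ElementTree, which supports only a subset of XPath; the markup and variable names are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: one <p> directly inside the <div>, one nested deeper.
doc = ET.fromstring(
    "<body><div>"
    "<p>direct child</p>"
    "<section><p>deeper descendant</p></section>"
    "</div></body>"
)

# "div/p" is the XPath analogue of CSS "div > p": direct children only.
children = [p.text for p in doc.findall("./div/p")]

# "div//p" is the XPath analogue of CSS "div p": any depth below the div.
descendants = [p.text for p in doc.findall("./div//p")]

print(children)
print(descendants)
```

The first list contains only the paragraph that sits directly under the div, while the second also picks up the one wrapped in the section element.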
The * selector selects all elements. It can also select all elements inside another element, for example div *.
When we use a CSS preprocessor like Sass or Less, we can nest one style rule within another to write cleaner, more readable code. Native CSS nesting is defined in the CSS Nesting Module, which is still a working draft; recent versions of the major browsers support it, but older ones do not.
Assuming indentation denotes containment in your example, the following XPath will select the span element for you:
//div[@id='...']/div[3]/div[2]/div/div/span
Of course, if there are no other span elements beneath the id'ed div, you could jump right to it:
//div[@id='...']//span
Or if there are no other span elements in the entire document:
//span
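To check the three expressions against each other, here is a rough sketch using Python's standard-library ElementTree rather than Scrapy itself. The markup is an assumed, well-formed reading of the question's example, and the id "container" is a placeholder for the elided id. ElementTree accepts only relative paths, so each expression starts with a dot:

```python
import xml.etree.ElementTree as ET

# Assumed, well-formed version of the markup from the question.
# The id "container" is a hypothetical stand-in for the elided id.
MARKUP = """
<html><body>
<div id="container">
  <div></div>
  <div></div>
  <div>
    <div></div>
    <div>
      <div>
        <div>
          <span>target</span>
        </div>
      </div>
    </div>
  </div>
</div>
</body></html>
"""

root = ET.fromstring(MARKUP)

# Full path: third child div, then its second child div, then two more divs.
explicit = root.findall(".//div[@id='container']/div[3]/div[2]/div/div/span")

# Shortcut: any span below the id'ed div, at any depth.
shortcut = root.findall(".//div[@id='container']//span")

# Broadest: any span anywhere in the document.
anywhere = root.findall(".//span")

for name, result in [("explicit", explicit), ("shortcut", shortcut), ("anywhere", anywhere)]:
    print(name, [span.text for span in result])
```

All three find the same single span here because the sample document contains only one. In a Scrapy spider, the original expressions (without the leading dot) could be passed to response.xpath(...) instead.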