I need to extract the text of a child node that may or may not itself be the parent of another node, using XPath in Scrapy. Consider cases like
<h1 class="main">
<span class="child">data</span>
</h1>
or
<h1 class="main">
<span class="child">
<span class="child2">data</span>
</span>
</h1>
My attempt was response.xpath(".//h1[@class='main']/span/text()").extract(), but /text() only selects text directly inside the span, so it misses "data" in the second case.
Use //text() instead of /text(). It returns a list of every text node inside your span, both the span's own text and the text of its children:
response.xpath(".//h1[@class='main']/span//text()").extract()
You can also use string(), which joins all the text inside the node into a single string:
response.xpath("string(.//h1[@class='main']/span)").extract()
response.xpath("string(.//h1[@class='main'])").extract()
if you're after the whole header text.