I want to get the href value:
<span class="title"> <a href="https://www.example.com"></a> </span>
I tried this:
Link = Link1.css('span[class=title] a::text').extract()[0]
But I just get the text inside the <a>. How can I get the link inside the href?
CSS is a language for applying styles to HTML documents. It defines selectors to associate those styles with specific HTML elements. Scrapy Selectors are a thin wrapper around the parsel library; the purpose of this wrapper is to provide better integration with Scrapy Response objects.
What you're looking for is:
Link = Link1.css('span[class=title] a::attr(href)').extract()[0]
Since you're also matching on the span's "class" attribute, you can even write
Link = Link1.css('span.title a::attr(href)').extract()[0]
Please note that the ::text pseudo-element and the ::attr(attributename) functional pseudo-element are NOT standard CSS3 selectors. They're Scrapy extensions to CSS selectors, available as of Scrapy 0.20.
Edit (2017-07-20): starting from Scrapy 1.0, you can use .extract_first() instead of .extract()[0]:
Link = Link1.css('span[class=title] a::attr(href)').extract_first()
Link = Link1.css('span.title a::attr(href)').extract_first()