Scrapy getting href out of div

Question

I started using Scrapy for a small project, but I can't extract the link. Instead of the URL, I only get "[]" each time the class is found. Am I missing something obvious?

sel = Selector(response)
for entry in sel.xpath("//div[@class='recipe-description']"):
    print(entry.xpath('href').extract())

Sample from the website:

<div class="recipe-description">
    <a href="http://www.url.com/">
        <h2 class="rows-2"><span>SomeText</span></h2>
    </a>
</div>

akhter wahab · Accepted Answer

Your XPath query is wrong.

for entry in sel.xpath("//div[@class='recipe-description']"):

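In this line you are iterating over the div elements, which don't have an href attribute. On top of that, xpath('href') looks for a child element named href; an attribute is selected with @href.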

To fix it, select the anchor elements inside the div and read their href attribute:

for entry in sel.xpath("//div[@class='recipe-description']/a"):
    print(entry.xpath('@href').extract())

The best solution is to extract the href attribute directly in the for loop:

for href in sel.xpath("//div[@class='recipe-description']/a/@href").extract():
    print(href)

For simplicity, you can also use CSS selectors:

for href in sel.css("div.recipe-description a::attr(href)").extract():
    print(href)

Tags: python, web-scraping, scrapy

Trollbrot
