
Unable to fetch `href` from Reddit embedded feed window using scrapy

Tags:

python

scrapy

I am trying to fetch the Reddit account name from the embedded Reddit feed window on the following page:

fetch('https://coinmarketcap.com/currencies/ripple/')

Here I am able to fetch the Twitter account details successfully using the following code:

#fetch the tweet account of coin
tweet_account = response.xpath('//a[starts-with(@href, "https://twitter.com")]/@href').extract()
tweet_account = [s for s in tweet_account if s != 'https://twitter.com/CoinMarketCap']
tweet_account = [s for s in tweet_account if len(s) < 60 ]
print(tweet_account) 

However, I am not able to get the Reddit account using a similar method:

reddit_account = response.xpath('//a[starts-with(@href, "https://www.reddit.com")]/@href').extract()
reddit_account = [s for s in reddit_account if s != 'https://www.reddit.com/r/CoinMarketCap']
reddit_account = [s for s in reddit_account if len(s) < 60 ]
print(reddit_account)

I have also tried fetching it directly with a simple XPath, but that doesn't work either:

response.xpath('//*[@id="reddit"]/div/div[1]/h4/a[2]/@href')

Output for:

response.xpath('//*[@id="reddit"]').extract() 

shows

['<div id="reddit" class="col-sm-6 text-left">\n</div>']

But in the browser there are many more tags inside this div. Why am I not able to get those tags?

Unfortunately, Scrapy is unable to find what is inside this div. The Reddit feed doesn't even use an iframe. Is there a separate URL I should be requesting?

Edit:

I ran view(response) in the shell, and the page it opened has the Twitter data but not the Reddit data. Why would that be?

priya kumari asked Mar 25 '19 06:03



1 Answer

Not all of the data shown on the website is necessarily in the page source. If you are using Google Chrome, press Ctrl+U to view the page source, then Ctrl+F to search for the data you want. If it is not in the page source, the page is most likely loading it separately (typically with JavaScript), and you may have to send other requests to get the data.
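The same page-source check can be done in code rather than in the browser. The sketch below is only an illustration using the standard library: the `LinkCollector` class, the `links_in_source` helper, and the `SAMPLE_SOURCE` string (including its Twitter URL) are hypothetical names and data made up for this example, with the sample shaped like the empty `div` from the question.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags that start with a given prefix."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.startswith(self.prefix):
                self.links.append(value)


def links_in_source(html, prefix):
    """Return all <a href> values in the raw HTML that start with prefix."""
    parser = LinkCollector(prefix)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical stand-in for the raw HTML that Scrapy downloads: the Twitter
# link is in the source, while the Reddit div arrives empty.
SAMPLE_SOURCE = """
<a href="https://twitter.com/example_coin">Twitter</a>
<div id="reddit" class="col-sm-6 text-left">
</div>
"""

twitter_links = links_in_source(SAMPLE_SOURCE, "https://twitter.com")
reddit_links = links_in_source(SAMPLE_SOURCE, "https://www.reddit.com")

print(twitter_links)  # links present in the raw source
print(reddit_links)   # empty: nothing for an XPath to match
```

In the Scrapy shell, the same check can be run on the real page by passing `response.text` as the `html` argument. If the Reddit list comes back empty there too, no XPath will find those links in that response, and the browser's network tab is the place to look for the request that actually delivers them.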

Agus Mathew answered Nov 17 '22 02:11
