
Web scraping requiring a mouse click?

Tags: r, web-scraping

I am trying (in R) to scrape some data from:

http://www.soccerbase.com/matches/results.sd?date=2012-11-04

Specifically, I want the match details that appear on the page when you press the i button. However, the information that appears after clicking the button is not contained in the original HTML. All I can see, where I expected the data to be, is this line:

<span class="infoField"><a href="#" class="info finished" title="Show full match details"></a></span>

That pretty much leaves me at a dead end. Any ideas?

guyabel asked Oct 06 '22 02:10



1 Answer

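library(XML)
library(RCurl)

# Results page for the date of interest
dataurl <- 'http://www.soccerbase.com/matches/results.sd?date=2012-11-04'
sdata <- htmlParse(dataurl)

# Collect the <tr> ids and strip the "tgc" prefix to leave the game ids
sid <- xpathSApply(sdata, '//*/tr/@id')
sid <- gsub('^tgc', '', sid)

# One additional-information URL per game id
mUrl <- paste0('http://www.soccerbase.com/matches/additional_information.sd?id_game=', sid)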

The code above builds the URLs for the additional match data. However, I would check with the site about harvesting their data before scraping it.
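To go from those URLs to parsed pages, something along these lines might work. It is only a sketch: the helper names game_ids, detail_urls and fetch_details are made up for illustration, the 2-second pause is an arbitrary choice, and the sample HTML at the bottom is invented so the id extraction can be tried without hitting the site. It also assumes the rows of interest are the ones whose id starts with "tgc", which is what the gsub call above suggests.

```r
library(XML)

# Pull game ids out of a results page. `page` is a URL or, with
# asText = TRUE, a string of HTML. Only rows whose id starts with
# "tgc" are kept, so unrelated <tr> ids are not turned into URLs.
game_ids <- function(page, asText = FALSE) {
  doc <- htmlParse(page, asText = asText)
  ids <- xpathSApply(doc, '//tr[starts-with(@id, "tgc")]/@id')
  sub('^tgc', '', as.character(unlist(ids)))
}

# Build one additional-information URL per game id
# (same endpoint as in the code above).
detail_urls <- function(ids) {
  if (length(ids) == 0) return(character(0))
  paste0('http://www.soccerbase.com/matches/additional_information.sd?id_game=', ids)
}

# Fetch and parse each detail page, pausing between requests to go
# easy on the server. The default pause is an assumption, not a
# documented requirement of the site.
fetch_details <- function(urls, pause = 2) {
  lapply(urls, function(u) {
    Sys.sleep(pause)
    htmlParse(u)
  })
}

# Offline check on a made-up fragment (these ids are invented)
sample_html <- '<table>
  <tr id="tgc111"><td>A v B</td></tr>
  <tr id="header"><td>x</td></tr>
  <tr id="tgc222"><td>C v D</td></tr>
</table>'
ids <- game_ids(sample_html, asText = TRUE)
urls <- detail_urls(ids)
```

Against the live site you would call game_ids(dataurl) and then fetch_details(detail_urls(...)). I have not looked at how the additional-information pages are laid out, so inspect one parsed document first (for example with xpathSApply(doc, '//body', xmlValue)) before writing XPath for the individual fields.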

user1609452 answered Oct 10 '22 04:10
