I am trying (in R) to scrape some data from:
http://www.soccerbase.com/matches/results.sd?date=2012-11-04
Specifically, I want the match details that appear on the page when you click the "i" button. However, the information shown after clicking the button is not contained in the original HTML source. Where I expected the data to be, all I can see is this line...
<span class="infoField"><a href="#" class="info finished" title="Show full match details"></a></span>
...which pretty much leaves me at a dead end...any ideas?
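library(XML)
library(RCurl)
dataurl <- 'http://www.soccerbase.com/matches/results.sd?date=2012-11-04'
# Download and parse the results page
sdata <- htmlParse(dataurl)
# Collect the id attribute of every table row
sid <- xpathSApply(sdata, '//*/tr/@id')
# Strip the 'tgc' prefix, leaving the bare game id
sid <- gsub('^tgc', '', sid)
# Build one additional-information URL per game
mUrl <- paste0('http://www.soccerbase.com/matches/additional_information.sd?id_game=', sid)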
The code above builds the URLs for the additional match data: each row id, with its 'tgc' prefix stripped, is passed as the id_game parameter of additional_information.sd. However, I would check with the site before harvesting their data.
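To see the id extraction without hitting the site, here is a minimal offline sketch. The HTML fragment is made up for illustration (the real page markup may differ), and the ids 101 and 102 are hypothetical. It also assumes match rows are the only ones whose id starts with 'tgc', so the XPath is narrowed to those rows.

```r
library(XML)

# Hypothetical fragment standing in for the results page; the real markup may differ.
sample_html <- '
<table>
  <tr><th>Home</th><th>Away</th></tr>
  <tr id="tgc101"><td>Team A</td><td>Team B</td></tr>
  <tr id="tgc102"><td>Team C</td><td>Team D</td></tr>
</table>'

doc <- htmlParse(sample_html, asText = TRUE)

# Keep only rows whose id starts with "tgc" (assumed to be the match rows)
row_ids <- xpathSApply(doc, "//tr[starts-with(@id, 'tgc')]/@id")

# Drop the prefix to get the bare game ids
ids <- sub("^tgc", "", as.character(row_ids))

# One additional-information URL per game
base <- "http://www.soccerbase.com/matches/additional_information.sd?id_game="
urls <- paste0(base, ids)
print(urls)
```

Each element of urls could then be passed to htmlParse in the same way as dataurl, ideally with a pause between requests.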