I want to create an R script that, among other things, downloads baseball player projection data from http://www.fangraphs.com/projections.aspx?pos=all&stats=bat&type=zips. There is a link to export this data to .csv near the top right corner of the data table, but it appears to be a JavaScript command (javascript:__doPostBack('ProjectionBoard1$cmdCSV','')). I am familiar with using download.file() with a direct link to a .csv file, but am not sure how to approach this.
How can I use R to extract this data?
The download isn't a simple response that can be retrieved with download.file(). The web page constructs a FORM with some huge parameters that store the state of the web page, then passes this (and a load of cookies too) to the server to get the CSV response.
To make this work in R (or any other programming language) you need to construct that request, which you can usually only do by first getting the web page, scraping the FORM parameters (and cookies), then constructing the same POST request your browser sent when you clicked on the link.
This might be possible with RCurl, and it can sometimes be easier if you have a browser that can save the POST request parameters from its developer tools so you can then get RCurl to read them.
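As a rough illustration of the steps above, here is a minimal sketch that uses httr and xml2 in place of RCurl (my substitution, not something the page requires). It assumes the page is a standard ASP.NET WebForms page, where __doPostBack(target, argument) simply fills the hidden fields __EVENTTARGET and __EVENTARGUMENT and submits the form. The function names hidden_fields, build_postback and fetch_csv are made up for this example, and I have not verified that the site still responds this way.

```r
library(httr)
library(xml2)

# Collect every named hidden <input> inside a form into a named list.
# On WebForms pages these usually include __VIEWSTATE and similar state fields.
hidden_fields <- function(page) {
  inputs <- xml_find_all(page, "//form//input[@type='hidden']")
  nm  <- xml_attr(inputs, "name")
  val <- xml_attr(inputs, "value")
  val[is.na(val)] <- ""          # inputs without a value attribute
  keep <- !is.na(nm)             # drop inputs without a name
  setNames(as.list(val[keep]), nm[keep])
}

# Reproduce what __doPostBack(target, "") would submit (assumption: the
# page follows the usual WebForms convention for these two field names).
build_postback <- function(page, target) {
  fields <- hidden_fields(page)
  fields[["__EVENTTARGET"]] <- target
  fields[["__EVENTARGUMENT"]] <- ""
  fields
}

# Fetch the page, then POST the scraped form state back to the same URL.
# httr keeps cookies between requests to the same host within a session.
fetch_csv <- function(url, target = "ProjectionBoard1$cmdCSV") {
  resp <- GET(url)
  stop_for_status(resp)
  page <- read_html(content(resp, as = "text", encoding = "UTF-8"))
  post <- POST(url, body = build_postback(page, target), encode = "form")
  stop_for_status(post)
  read.csv(text = content(post, as = "text", encoding = "UTF-8"))
}

# Usage (not run here, since it needs network access):
# proj <- fetch_csv("http://www.fangraphs.com/projections.aspx?pos=all&stats=bat&type=zips")
```

If the server rejects the request, comparing the fields returned by build_postback() with the POST body shown in the browser's developer tools is usually the quickest way to see what is missing.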
Another common technique in web scraping is to essentially run a browser that can be automated by a scripting language. There's an R package that leverages Selenium that might be able to do this:
http://cran.r-project.org/web/packages/RSelenium/index.html
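A minimal sketch of the browser-automation route, assuming a local Firefox that RSelenium's rsDriver() can start. Rather than triggering the CSV download (which would require configuring the browser's download handling), it reads the rendered page source and parses the HTML tables with rvest. The helpers largest_table and scrape_rendered_table are hypothetical, and "the table with the most rows is the data table" is only a guess about the page layout.

```r
library(xml2)
library(rvest)

# Parse all <table> elements in an HTML string and return the one with the
# most rows (a heuristic for this example, not a property of the site).
largest_table <- function(html) {
  tables <- html_table(read_html(html))
  if (length(tables) == 0) stop("no tables found in page source")
  tables[[which.max(vapply(tables, nrow, integer(1)))]]
}

# Drive a real browser so the page's JavaScript runs, then hand the
# rendered HTML to largest_table(). Requires the RSelenium package.
scrape_rendered_table <- function(url) {
  driver <- RSelenium::rsDriver(browser = "firefox", verbose = FALSE)
  on.exit({
    driver$client$close()
    driver$server$stop()
  })
  driver$client$navigate(url)
  largest_table(driver$client$getPageSource()[[1]])
}

# Usage (not run here, since it needs a browser and network access):
# proj <- scrape_rendered_table("http://www.fangraphs.com/projections.aspx?pos=all&stats=bat&type=zips")
```

This only captures the rows currently shown on the page, so a paginated table would need extra navigation steps.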
There are some related (but not duplicate) Q's here, such as:
How to use R to download a zipped file from a SSL page that requires cookies
An R-help posting from a couple of years ago has some suggestions too:
https://stat.ethz.ch/pipermail/r-help//2012-September/335769.html