I would like to read online data into R using download.file(), as shown below.
URL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv"
download.file(URL, destfile = "./data/data.csv", method = "curl")
Someone suggested that I add the line setInternet2(TRUE), but it still doesn't work.
The error I get is:
Warning messages:
1: running command 'curl "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv" -o "./data/data.csv"' had status 127
2: In download.file(URL, destfile = "./data/data.csv", method = "curl", :
  download had nonzero exit status
Appreciate your help.
Generally, downloading a file from an HTTP server endpoint via HTTP GET consists of three steps: construct the HTTP GET request, send the request and receive the HTTP response from the server, and save the contents of the response to a local file.
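The three steps above can be sketched in base R with a url() connection. This is only an illustration: fetch_to_file() and its chunk_size parameter are hypothetical names, not part of base R, and whether url() can open https:// addresses depends on how your R was built.

```r
# Hypothetical helper: copy the body behind a URL into a local file, chunk by chunk.
fetch_to_file <- function(src_url, dest, chunk_size = 65536L) {
  # First two steps: opening the connection builds and sends the request
  con <- url(src_url, open = "rb")
  on.exit(close(con), add = TRUE)

  # Third step: write the response body to a local file in binary mode
  out <- file(dest, open = "wb")
  on.exit(close(out), add = TRUE)

  total <- 0
  repeat {
    chunk <- readBin(con, what = "raw", n = chunk_size)
    if (length(chunk) == 0L) break   # nothing left to read
    writeBin(chunk, out)
    total <- total + length(chunk)
  }
  invisible(total)                   # number of bytes written
}

# Usage (assumes the ./data directory already exists):
# fetch_to_file(URL, "./data/data.csv")
```

In practice download.file() does the same job; the sketch only makes the individual steps visible.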
To grab a file with curl, run:
$ curl https://your-domain/file.pdf
To get a file over the FTP or SFTP protocol:
$ curl ftp://ftp-your-domain-name/file.tar.gz
To set the output file name while downloading, use -o:
$ curl -o file.pdf https://your-domain-name/long-file-name.pdf
It might be easiest to try the RCurl package. Install the package and try the following:
# install.packages("RCurl")
library(RCurl)
URL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv"
x <- getURL(URL)
## Or
## x <- getURL(URL, ssl.verifypeer = FALSE)
out <- read.csv(textConnection(x))
head(out[1:6])
#   RT SERIALNO DIVISION PUMA REGION ST
# 1  H      186        8  700      4 16
# 2  H      306        8  700      4 16
# 3  H      395        8  100      4 16
# 4  H      506        8  700      4 16
# 5  H      835        8  800      4 16
# 6  H      989        8  700      4 16
dim(out)
# [1] 6496  188

## Or, with download.file() and the libcurl method
download.file("https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv", destfile = "reviews.csv", method = "libcurl")
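Once the CSV text is in memory, it can also be parsed through the text argument of read.csv() and written to disk with cat(). A minimal sketch, in which the string x is a made-up stand-in for what getURL(URL) would return and dest is a hypothetical temporary path:

```r
# Made-up stand-in for the character string returned by getURL(URL)
x <- "RT,SERIALNO,DIVISION\nH,186,8\nH,306,8\n"

# Parse the text directly, without going through textConnection()
out <- read.csv(text = x)

# Keep a local copy of the raw text as well
dest <- tempfile(fileext = ".csv")   # hypothetical destination path
cat(x, file = dest)
```

This avoids the external curl program entirely, at the cost of holding the whole file in memory.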
Here's an update as of Nov 2014. I find that setting method='curl' did the trick for me (while method='auto' does not).
For example:
# does not work
download.file(url='https://s3.amazonaws.com/tripdata/201307-citibike-tripdata.zip', destfile='localfile.zip')

# does not work. this appears to be the default anyway
download.file(url='https://s3.amazonaws.com/tripdata/201307-citibike-tripdata.zip', destfile='localfile.zip', method='auto')

# works!
download.file(url='https://s3.amazonaws.com/tripdata/201307-citibike-tripdata.zip', destfile='localfile.zip', method='curl')
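Note that method='curl' shells out to an external curl program. Exit status 127, as in the question's warning, typically means the command could not be run at all, for instance because curl is not installed or not on the PATH. A hedged sketch of choosing a method based on what is available; pick_download_method() is a hypothetical helper, not part of base R:

```r
# Hypothetical helper: choose a download.file() method that should exist on this machine.
pick_download_method <- function() {
  if (isTRUE(unname(capabilities("libcurl")))) {
    "libcurl"   # compiled into this R build, no external program needed
  } else if (nzchar(Sys.which("curl"))) {
    "curl"      # an external curl executable was found on the PATH
  } else {
    "auto"      # fall back to R's own choice
  }
}

method <- pick_download_method()

# Usage (creates ./data first, since the destination directory must exist):
# dir.create("data", showWarnings = FALSE)
# download.file(URL, destfile = "./data/data.csv", method = method)
```

Which branch is taken depends on the R version and platform, so treat the result as a starting point rather than a guarantee that the download will succeed.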