Download a file from HTTPS using download.file()

Tags: r, csv, https

I would like to read online data into R using download.file(), as shown below.

    URL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv"
    download.file(URL, destfile = "./data/data.csv", method="curl")

Someone suggested to me that I add the line setInternet2(TRUE), but it still doesn't work.

The error I get is:

    Warning messages:
    1: running command 'curl  "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv"  -o "./data/data.csv"' had status 127
    2: In download.file(URL, destfile = "./data/data.csv", method = "curl",  :
      download had nonzero exit status

Appreciate your help.

asked Apr 12 '14 09:04 by useR



2 Answers

It might be easiest to try the RCurl package. Install the package and try the following:

    # install.packages("RCurl")
    library(RCurl)
    URL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv"
    x <- getURL(URL)
    ## Or
    ## x <- getURL(URL, ssl.verifypeer = FALSE)
    out <- read.csv(textConnection(x))
    head(out[1:6])
    #   RT SERIALNO DIVISION PUMA REGION ST
    # 1  H      186        8  700      4 16
    # 2  H      306        8  700      4 16
    # 3  H      395        8  100      4 16
    # 4  H      506        8  700      4 16
    # 5  H      835        8  800      4 16
    # 6  H      989        8  700      4 16
    dim(out)
    # [1] 6496  188

    download.file("https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv",destfile="reviews.csv",method="libcurl")
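If the goal is a file on disk rather than a string in memory, the download.file() call with method="libcurl" can be wrapped in a small helper. This is only a sketch: fetch_csv is a hypothetical name, and it assumes your R build has libcurl support (check with capabilities("libcurl")). It also creates the target directory first, since download.file() will not create a missing ./data folder for you.

```r
# Hypothetical helper (not part of base R or RCurl): download a CSV to disk,
# then read it into a data frame.
fetch_csv <- function(url, destfile, method = "libcurl") {
  # download.file() does not create missing directories for destfile
  dir.create(dirname(destfile), recursive = TRUE, showWarnings = FALSE)

  # mode = "wb" writes the bytes unchanged (no text-mode translation on Windows)
  status <- download.file(url, destfile = destfile, method = method, mode = "wb")
  if (status != 0) {
    stop("download failed with status ", status)
  }

  read.csv(destfile)
}

# Example call (needs network access):
# out <- fetch_csv("https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv",
#                  "./data/data.csv")
```

Reading from the saved file also avoids holding the whole response as one character string, which is what getURL() plus textConnection() does.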
answered Oct 23 '22 22:10 by A5C1D2H2I1M1N2O1R2T1


Here's an update as of Nov 2014. I found that setting method='curl' did the trick for me (while method='auto' did not).

For example:

    # does not work
    download.file(url='https://s3.amazonaws.com/tripdata/201307-citibike-tripdata.zip',
                  destfile='localfile.zip')

    # does not work. this appears to be the default anyway
    download.file(url='https://s3.amazonaws.com/tripdata/201307-citibike-tripdata.zip',
                  destfile='localfile.zip', method='auto')

    # works!
    download.file(url='https://s3.amazonaws.com/tripdata/201307-citibike-tripdata.zip',
                  destfile='localfile.zip', method='curl')
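One caveat: method='curl' runs an external curl program, so it only works if curl is installed and on the PATH. Exit status 127 is the shell's conventional "command not found" code, which suggests that was the problem behind the warning in the question. A rough way to choose a method that is usable on the current machine is sketched below. pick_download_method is a made-up name, and the preference order (built-in libcurl first, external curl second) is my assumption, not something download.file() prescribes.

```r
# Hypothetical helper: pick a download.file() method that is usable here.
pick_download_method <- function() {
  if (isTRUE(unname(capabilities("libcurl")))) {
    "libcurl"   # built into R, no external program needed
  } else if (nzchar(Sys.which("curl"))) {
    "curl"      # an external curl executable was found on the PATH
  } else {
    "auto"      # fall back to whatever R chooses by default
  }
}

# Example call (needs network access):
# download.file(url = 'https://s3.amazonaws.com/tripdata/201307-citibike-tripdata.zip',
#               destfile = 'localfile.zip', method = pick_download_method())
```

If Sys.which("curl") returns an empty string, installing curl or switching to method='libcurl' is likely the simplest fix.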
answered Oct 23 '22 21:10 by arvi1000