I am trying to download and extract a .csv file from a webpage using R.
This question is a duplicate of "Using R to download zipped data file, extract, and import data".
I cannot get the solution to work, but it may be due to the web address I am using.
I am trying to download the .csv files from http://data.worldbank.org/country/united-kingdom (under the "Download data" drop-down).
Using @Dirk's solution from the link above, I tried:
temp <- tempfile()
download.file("http://api.worldbank.org/v2/en/country/gbr?downloadformat=csv",temp)
con <- unz(temp, "gbr_Country_en_csv_v2.csv")
dat <- read.table(con, header=T, skip=2)
unlink(temp)
I got the extended link by looking at the page source code, which I expect is causing the problems, although it works if I paste it into the address bar.
The file downloads with the correct size:
download.file("http://api.worldbank.org/v2/en/country/gbr?downloadformat=csv",temp)
# trying URL 'http://api.worldbank.org/v2/en/country/gbr?downloadformat=csv'
# Content type 'application/zip' length 332358 bytes (324 Kb)
# opened URL
# downloaded 324 Kb
# also tried unzip but get this warning
con <- unzip(temp, "gbr_Country_en_csv_v2.csv")
# Warning message:
# In unzip(temp, "gbr_Country_en_csv_v2.csv") :
# requested file not found in the zip file
But these are the file names when I download them manually.
I'd appreciate some help with where I am going wrong. Thanks.
I am using Windows 8 and R version 3.1.0.
You can use the download.file() function in R to download the zip file, and the unzip() function to extract it. Finally, you can read the extracted file into R.
Instead of losing time unzipping the file manually, it's perfectly fine to load these files directly into R.
To get your data to download and uncompress, you need to set mode="wb" in the call to download.file():
download.file("...", temp, mode = "wb")
unzip(temp, "gbr_Country_en_csv_v2.csv")
dd <- read.table("gbr_Country_en_csv_v2.csv", sep = ",", skip = 2, header = TRUE)
It looks like the default is "w", which assumes a text file. If it were a plain csv file, this would be fine. But since it's compressed, it's a binary file, hence the "wb". On Windows, a text-mode transfer rewrites line endings, which corrupts the archive, so without the "wb" part you can't open the zip at all.
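As a rough sketch of how these pieces could fit together, the helper below downloads the archive in binary mode, lists its contents before choosing a member, and reads that member through unz(). The function name read_zipped_csv and its pattern argument are made up for illustration and do not come from the posts above. Listing the archive first may also help with the "requested file not found in the zip file" warning, since it shows the names the archive actually contains.

```r
# Hypothetical helper (not from the original posts): download a zip archive
# in binary mode and read the first member whose name matches `pattern`.
read_zipped_csv <- function(url, pattern = "\\.csv$", ...) {
  temp <- tempfile(fileext = ".zip")
  on.exit(unlink(temp), add = TRUE)

  # mode = "wb" writes the bytes unchanged, so the archive is not altered on Windows
  download.file(url, temp, mode = "wb", quiet = TRUE)

  # list = TRUE returns a data frame describing the archive without extracting it
  members <- unzip(temp, list = TRUE)$Name
  hits <- grep(pattern, members, value = TRUE)
  if (length(hits) == 0) {
    stop("no matching file in archive; it contains: ",
         paste(members, collapse = ", "))
  }

  # unz() opens a connection to a single member; read.csv() closes it when done
  read.csv(unz(temp, hits[1]), ...)
}

# Possible usage with the URL from the question (not run here):
# dat <- read_zipped_csv(
#   "http://api.worldbank.org/v2/en/country/gbr?downloadformat=csv",
#   skip = 2
# )
```

Extra arguments such as skip = 2 are passed straight through to read.csv(), so the header rows in the World Bank files can be skipped in the same way as in the answer above.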