I am trying to parse some online weather data with R. The data is a binary file that has been gzipped. An example file is:
ftp://ftp.cpc.ncep.noaa.gov/precip/CPC_UNI_PRCP/GAUGE_GLB/V1.0/2005/PRCP_CU_GAUGE_V1.0GLB_0.50deg.lnx.20050101.gz
If I download the file to my computer and manually unzip it, I can easily do the following:
myFile <- ( "/tmp/PRCP_CU_GAUGE_V1.0GLB_0.50deg.lnx.20050101" )
to.read = file( myFile, "rb")
myPoints <- readBin(to.read, real(), n=1e6, size = 4, endian = "little")
What I would prefer to do is automate both the download/unzip along with the read. So I thought that would be as simple as the following:
p <- gzcon( url( "ftp://ftp.cpc.ncep.noaa.gov/precip/CPC_UNI_PRCP/GAUGE_GLB/V1.0/2005/PRCP_CU_GAUGE_V1.0GLB_0.50deg.lnx.20050101.gz" ) )
myPoints <- readBin(p, real(), n=1e6, size = 4, endian = "little")
This seems to work just dandy, but in the manual step the vector myPoints
has length 518400, which is accurate. However if R handles the download and read as in the second example, I get a different length vector every time I run the code. Seriously. I'm not smoking anything. I swear. I run it multiple time and each time the vector is a different length, always less than the expected 518400.
I also tried getting R to download the gzip file using the following:
temp <- tempfile()
myFile <- download.file("ftp://ftp.cpc.ncep.noaa.gov/precip/CPC_UNI_PRCP/GAUGE_GLB/V1.0/2005/PRCP_CU_GAUGE_V1.0GLB_0.50deg.lnx.20050101.gz",temp)
I found that often that would return a Warning about the file not being the expected size. Like the following:
Warning message:
In download.file("ftp://ftp.cpc.ncep.noaa.gov/precip/CPC_UNI_PRCP/GAUGE_GLB/V1.0/2005/PRCP_CU_GAUGE_V1.0GLB_0.50deg.lnx.20050101.gz", :
downloaded length 162176 != reported length 179058
Any tips you can throw my way that might help me solve this?
-J
Try this:
R> remfname <- "ftp://ftp.cpc.ncep.noaa.gov/precip/CPC_UNI_PRCP/GAUGE_GLB/V1.0/2005/PRCP_CU_GAUGE_V1.0GLB_0.50deg.lnx.20050101.gz"
R> locfname <- "/tmp/data.gz"
R> download.file(remfname, locfname)
trying URL 'ftp://ftp.cpc.ncep.noaa.gov/precip/CPC_UNI_PRCP/GAUGE_GLB/V1.0/2005/PRCP_CU_GAUGE_V1.0GLB_0.50deg.lnx.20050101.gz'
ftp data connection made, file length 179058 bytes
opened URL
==================================================
downloaded 174 Kb
R> con <- gzcon(file(locfname, "rb"))
R> myPoints <- readBin(con, real(), n=1e6, size = 4, endian = "little")
R> close(con)
R> str(myPoints)
num [1:518400] 0 0 0 0 0 0 0 0 0 0 ...
R>
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With