I have a netCDF file that I wish to extract a subset from defined by latitude/longitude boundaries (i.e. a lat/long defined box), using the ‘ncdf’ package in R.
A summary of my netCDF file is below. It has two dimensions (latitude and longitude) and 1 variable (10U_GDS4_SFC). It is essentially a lat/long grid containing wind values:
[1] "file example.nc has 2 dimensions:"
[1] "lat_0 Size: 1280"
[1] "lon_1 Size: 2560"
[1] "------------------------"
[1] "file example.nc has 1 variables:"
[1] "float 10U_GDS4_SFC[lon_1,lat_0] Longname:10 metre U wind component Missval:1e+30"
The latitude variable runs from +90 to -90 and the longitude variable runs form 0 to 360.
I wish to extract a subset of the overall grid using the following geographical corner boundaries:
bottom left corner: Lat: 34.5˚, Long: 355˚, top left corner: Lat: 44.5˚, Long: 355˚, top right corner: Lat: 44.5˚, Long: 12˚, bottom right corner: Lat: 34.5˚ , Long: 12˚
I am aware that parts of a variable can be extracted using the get.var.ncdf()
command (example below):
z1 = get.var.ncdf(example.nc, "10U_GDS4_SFC", start=c(11,26), count=c(5,5))
However, I can’t work out how lat/long can be incorporated so that I end up with a subsetted spatial grid containing variable values. I am new to working with netCDF values in R, and any advice would be greatly appreciated. Many thanks!
In principle you are 2/3 of the way there. You can of course create the starting indices using something like this:
require(ncdf4)
ncFile <- nc_open( MyNetCDF )
LonStartIdx <- which( ncFile$dim$lon$vals == 355)
LatStartIdx <- which( ncFile$dim$lat$vals == 34.5)
Do the same for the counts. Then, read the variable you want
MyVariable <- ncvar_get( ncFile, varName, start=c( LonStartIdx, LatStartIdx), count=...)
However in your case you are out of luck as far as I know. The reading / writing netcdf routines do their stuff sequentially. Your grid wraps around since you have coordinates that go from 0 - 360 in longitude and you are interested in a box that contains the zero meridian.
For you (assuming you have not too much data) it would make more sense to read in the full grid into R, and then use either subset
or create indices using which
and cut out your "box" in R.
ncFile <- nc_open( MyNetCDF )
LonIdx <- which( ncFile$dim$lon$vals > 355 | ncFile$dim$lon$vals < 10)
LatIdx <- which( ncFile$dim$lat$vals > 34.5 & ncFile$dim$lat$vals < 44.5)
MyVariable <- ncvar_get( ncFile, varName)[ LonIdx, LatIdx]
nc_close(ncFile)
Remark: I prefer ncdf4
, I find the syntax a bit easier to remember (and there was another advantage over the older netcdf R-package that I have forgotten...)
Ok. Comments cannot be as long as I would need them, so I updated the answer No worries. Let's go through the questions step by step.
which
function way will work. I use it myself.The data will be in a similar format as in the netcf file, but I am not too sure if there is some problem with the 0 meridian (I guess yes). You might have to swap the two halves by doing something like this (replace the corresponding line in the 2nd example)
LonIdx <- c(which( ncFile$dim$lon$vals > 355) , which( ncFile$dim$lon$vals < 10) )
This changes the order of the coordinate indices so that the Western part comes first and then the Eastern.
Reformatting everything to a 2x3 data frame is possible. Take the data my 2nd code example returns (will be a matrix, [lon x lat]. Also get the values of the coordinates from
lon <- ncFile$dim$lon$val[LonIdx]
(or how longitude is called in your example, same for lat
). Then assemble the matrix using
cbind( rep(lat, each=length(lon)), rep(lon,length(lat)), c(myVariable) )
The coordinates will of course be the same as in the netcdf file...
You need to samity check the last cbind, as I am only about 98% confident that I have not messed up the coordinates. In the R scripts I found on my desktop I use loops, which are... evil... This should be (a bit?) faster and is also more sensible.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With