I have a netcdf file (global domain) which I do not know its projection information, and I want to extract the data (with lon and lat) based on the shapefile.
The objective is to find data for the domain represented by the shapefile ( the original netcdf contains global data). Besides, the final data format should be a data frame, which contains lon, lat, and data. I assume that the extract
and mask
function will be useful.
The netcdf and shapefile can be found at https://www.dropbox.com/s/ju96eitzzd0xxg8/precip.2000.nc?dl=0 and https://www.dropbox.com/s/8wfgf8207dbh79r/gpr_000b11a_e.zip?dl=0. Thanks for any help.
library(rgdal)
library(ncdf4)
library(raster)
shp = readOGR("gpr_000b11a_e.shp")
pre1 = nc_open("precip.2000.nc")
pre1
File precip.2000.nc (NC_FORMAT_NETCDF4_CLASSIC):
1 variables (excluding dimension variables):
float precip[lon,lat,time]
missing_value: -9.96920996838687e+36
var_desc: Precipitation
level_desc: Surface
statistic: Total
parent_stat: Other
long_name: Daily total of precipitation
cell_methods: time: sum
avg_period: 0000-00-01 00:00:00
actual_range: 0
actual_range: 482.555358886719
units: mm
valid_range: 0
valid_range: 1000
dataset: CPC Global Precipitation
3 dimensions:
lat Size:360
actual_range: 89.75
actual_range: -89.75
long_name: Latitude
units: degrees_north
axis: Y
standard_name: latitude
coordinate_defines: center
lon Size:720
long_name: Longitude
units: degrees_east
axis: X
standard_name: longitude
actual_range: 0.25
actual_range: 359.75
coordinate_defines: center
time Size:366 *** is unlimited ***
long_name: Time
axis: T
standard_name: time
coordinate_defines: start
actual_range: 876576
actual_range: 885336
delta_t: 0000-00-01 00:00:00
avg_period: 0000-00-01 00:00:00
units: hours since 1900-01-01 00:00:00
7 global attributes:
Conventions: CF-1.0
version: V1.0
history: created 9/2016 by CAS NOAA/ESRL PSD
title: CPC GLOBAL PRCP V1.0
References: https://www.esrl.noaa.gov/psd/data/gridded/data.cpc.globalprecip.html
dataset_title: CPC GLOBAL PRCP V1.0
Source: ftp://ftp.cpc.ncep.noaa.gov/precip/CPC_UNI_PRCP/
As we can see, there is no information about the projection or crs for the netcdf file.
You can read netCDF data in R using the ncdf4 package. You can use the nc_open function from the ncdf4 package to open a connection to a netCDF file.
Start ArcMap. Select the tool "Make NetCDF Raster Layer" by either using the "Search" or expanding the branch "Multidimension Tools" in the "ArcToolbox". For the input field "Input netCDF File" select the NetCDF file downloaded in step 1. Click the Variable drop-down arrow and choose a variable from the list.
You need to open the NetCDF file as a raster* object to process as a raster it in R. Use brick
or stack
for this, rather than nc_open
pre1.brick = brick("precip.2000.nc")
You will notice that this file uses the normal convention in climate science of giving longitudes in values ranging from 0 to 360 degrees:
extent(pre1.brick)
# class : Extent
# xmin : 0
# xmax : 360
# ymin : -90
# ymax : 90
We can convert to conventional -180 to 180 longitudes usint rotate
pre1.brick = rotate(pre1.brick)
Now lets see what the projections of our raster and polygon files are:
crs(shp)
# +proj=longlat +datum=NAD83 +no_defs +ellps=GRS80 +towgs84=0,0,0
crs(pre1.brick)
# +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
You can see that both use lat-long coordinates, but differ in terms of datum and ellipse. It is usually recommended for accuracy and speed, when possible, to project the vector data rather than the raster to get them both in the same coordinate system:
shp = spTransform(shp, crs(pre1.brick))
Having projected both to the same coord system, we can apply the shapefile as a mask to the raster brick:
pre1.mask = mask(pre1.brick, shp)
And convert to a data.frame. Here I just demonstrate for the first layer. You can do all layers at once if you prefer, by removing [[1]]
in the following line
pre1.df = as.data.frame(pre1.mask[[1]], xy=TRUE)
If you want, you can remove lines that have NA, to only leave cells inside the mask containing data:
pre1.df = pre1.df[complete.cases(pre1.df),]
head(pre1.df)
# x y X2000.01.01.00.00.00
# 10278 -81.25 82.75 0.2647110
# 10279 -80.75 82.75 0.2721323
# 10280 -80.25 82.75 0.2797630
# 10281 -79.75 82.75 0.2875668
# 10282 -79.25 82.75 0.2925712
# 10283 -78.75 82.75 0.2995636
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With