Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to extract the data in a netcdf file based on a shapefile in R?

I have a netcdf file (global domain) which I do not know its projection information, and I want to extract the data (with lon and lat) based on the shapefile.

The objective is to find data for the domain represented by the shapefile ( the original netcdf contains global data). Besides, the final data format should be a data frame, which contains lon, lat, and data. I assume that the extract and mask function will be useful.

The netcdf and shapefile can be found at https://www.dropbox.com/s/ju96eitzzd0xxg8/precip.2000.nc?dl=0 and https://www.dropbox.com/s/8wfgf8207dbh79r/gpr_000b11a_e.zip?dl=0. Thanks for any help.

library(rgdal)
library(ncdf4)
library(raster)

shp = readOGR("gpr_000b11a_e.shp")
pre1 = nc_open("precip.2000.nc")
pre1

File precip.2000.nc (NC_FORMAT_NETCDF4_CLASSIC):
 1 variables (excluding dimension variables):
    float precip[lon,lat,time]   
        missing_value: -9.96920996838687e+36
        var_desc: Precipitation
        level_desc: Surface
        statistic: Total
        parent_stat: Other
        long_name: Daily total of precipitation
        cell_methods: time: sum
        avg_period: 0000-00-01 00:00:00
        actual_range: 0
         actual_range: 482.555358886719
        units: mm
        valid_range: 0
         valid_range: 1000
        dataset: CPC Global Precipitation

 3 dimensions:
    lat  Size:360
        actual_range: 89.75
         actual_range: -89.75
        long_name: Latitude
        units: degrees_north
        axis: Y
        standard_name: latitude
        coordinate_defines: center
    lon  Size:720
        long_name: Longitude
        units: degrees_east
        axis: X
        standard_name: longitude
        actual_range: 0.25
         actual_range: 359.75
        coordinate_defines: center
    time  Size:366   *** is unlimited ***
        long_name: Time
        axis: T
        standard_name: time
        coordinate_defines: start
        actual_range: 876576
         actual_range: 885336
        delta_t: 0000-00-01 00:00:00
        avg_period: 0000-00-01 00:00:00
        units: hours since 1900-01-01 00:00:00

7 global attributes:
    Conventions: CF-1.0
    version: V1.0
    history: created 9/2016 by CAS NOAA/ESRL PSD
    title: CPC GLOBAL PRCP V1.0
    References: https://www.esrl.noaa.gov/psd/data/gridded/data.cpc.globalprecip.html
    dataset_title: CPC GLOBAL PRCP V1.0
    Source: ftp://ftp.cpc.ncep.noaa.gov/precip/CPC_UNI_PRCP/ 

As we can see, there is no information about the projection or crs for the netcdf file.

like image 809
Yang Yang Avatar asked May 06 '18 21:05

Yang Yang


People also ask

How do I read netCDF data in R?

You can read netCDF data in R using the ncdf4 package. You can use the nc_open function from the ncdf4 package to open a connection to a netCDF file.

How do I open a netCDF file in GIS?

Start ArcMap. Select the tool "Make NetCDF Raster Layer" by either using the "Search" or expanding the branch "Multidimension Tools" in the "ArcToolbox". For the input field "Input netCDF File" select the NetCDF file downloaded in step 1. Click the Variable drop-down arrow and choose a variable from the list.


1 Answers

You need to open the NetCDF file as a raster* object to process as a raster it in R. Use brick or stack for this, rather than nc_open

pre1.brick = brick("precip.2000.nc")

You will notice that this file uses the normal convention in climate science of giving longitudes in values ranging from 0 to 360 degrees:

extent(pre1.brick)
# class       : Extent 
# xmin        : 0 
# xmax        : 360 
# ymin        : -90 
# ymax        : 90 

We can convert to conventional -180 to 180 longitudes usint rotate

pre1.brick = rotate(pre1.brick)

Now lets see what the projections of our raster and polygon files are:

crs(shp)
# +proj=longlat +datum=NAD83 +no_defs +ellps=GRS80 +towgs84=0,0,0 
crs(pre1.brick)
# +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 

You can see that both use lat-long coordinates, but differ in terms of datum and ellipse. It is usually recommended for accuracy and speed, when possible, to project the vector data rather than the raster to get them both in the same coordinate system:

shp = spTransform(shp, crs(pre1.brick))

Having projected both to the same coord system, we can apply the shapefile as a mask to the raster brick:

pre1.mask = mask(pre1.brick, shp)

And convert to a data.frame. Here I just demonstrate for the first layer. You can do all layers at once if you prefer, by removing [[1]] in the following line

pre1.df = as.data.frame(pre1.mask[[1]], xy=TRUE)

If you want, you can remove lines that have NA, to only leave cells inside the mask containing data:

pre1.df = pre1.df[complete.cases(pre1.df),]
head(pre1.df)
#            x     y X2000.01.01.00.00.00
# 10278 -81.25 82.75            0.2647110
# 10279 -80.75 82.75            0.2721323
# 10280 -80.25 82.75            0.2797630
# 10281 -79.75 82.75            0.2875668
# 10282 -79.25 82.75            0.2925712
# 10283 -78.75 82.75            0.2995636
like image 105
dww Avatar answered Sep 28 '22 16:09

dww