So I am trying to index a NetCDF file to get stream flow rate data in a certain grid cell. The NetCDF file I am using has the following characteristics:
<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF3_CLASSIC data model, file format NETCDF3):
CDI: Climate Data Interface version 1.6.4 (http://code.zmaw.de/projects/cdi)
Conventions: CF-1.4
dimensions(sizes): lon(3600), lat(1800), time(31)
variables(dimensions): float64 lon(lon), float64 lat(lat), float64 time(time), float32 dis(time,lat,lon)
I have 35+ years of this data, and I am trying to extract the data for an individual grid cell and build a time series to compare with a different model's forecasts. The code I am currently using to extract data from a grid cell is below.
from netCDF4 import Dataset
import numpy as np
root_grp = Dataset(r'C:\Users\wadear\Desktop\ERAIland_daily_dis_198001.nc')
dis = root_grp.variables['dis']
lat = np.round(root_grp.variables['lat'][:], decimals=2).tolist()
lon = np.round(root_grp.variables['lon'][:], decimals=2).tolist()
time = root_grp.variables['time'].shape[0]
lat_index = lat.index(27.95)
lon_index = lon.index(83.55)
for i in range(time):
    print(dis[i][lat_index][lon_index])
Right now this feels really slow. Repeating it over a 35+ year timespan, and for multiple grid cells, means the time will really build up.
Is there a tool to speed up this process with faster I/O or indexing?
Thanks!
You should get a big time saving if you remove the loop over time and access the entire time series at once, i.e.
dis[:,lat_index,lon_index]
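As a rough sketch of how that single slice might be applied across many monthly files: the helper names `nearest_index` and `extract_series` below are made up for illustration, and the variable names `lat`, `lon` and `dis` are taken from the file description in the question. Note that it picks the nearest coordinate rather than rounding and calling `list.index`, which is a substitution on my part, not something the question's code does.

```python
def nearest_index(coords, target):
    """Return the index of the coordinate value closest to target."""
    return min(range(len(coords)), key=lambda i: abs(coords[i] - target))


def extract_series(paths, lat_target, lon_target, var_name="dis"):
    """Collect one grid cell's time series from several files (hypothetical helper)."""
    # Imported here so nearest_index can be used without netCDF4 installed.
    from netCDF4 import Dataset

    series = []
    for path in paths:
        with Dataset(path) as nc:
            lat_index = nearest_index(nc.variables["lat"][:], lat_target)
            lon_index = nearest_index(nc.variables["lon"][:], lon_target)
            # One sliced read per file instead of one read per time step.
            values = nc.variables[var_name][:, lat_index, lon_index]
            # tolist() turns any masked (missing) values into None.
            series.extend(values.tolist())
    return series
```

Something like `extract_series(sorted_monthly_paths, 27.95, 83.55)` would then return one flat list covering all the files, provided the paths are passed in chronological order.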
Further speed gains can be obtained if you apply chunking in the time dimension. Look up the documentation for nccopy. If you need to access the time series repeatedly, this is worth doing. You may wish to concatenate some of your NetCDF files before chunking, e.g. monthly -> annual. This is done using the ncrcat utility.
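A minimal sketch of that concatenate-then-rechunk workflow, driven from Python, could look like the following. It assumes the `ncrcat` (NCO) and `nccopy` (netCDF utilities) executables are on your PATH. The function names are hypothetical, and the chunk sizes are placeholders to tune for your own access pattern, not recommendations.

```python
import subprocess


def ncrcat_command(monthly_files, combined_file):
    """Build the ncrcat call that joins files along the record (time) dimension."""
    return ["ncrcat", *monthly_files, combined_file]


def nccopy_command(src, dst, time_chunk, lat_chunk=10, lon_chunk=10):
    """Build the nccopy call that rewrites src with long chunks in time."""
    # Chunk spec format is dim/size,dim/size,... (sizes here are assumptions).
    chunk_spec = f"time/{time_chunk},lat/{lat_chunk},lon/{lon_chunk}"
    # Chunking needs a netCDF-4 output file, so request that format explicitly.
    return ["nccopy", "-k", "netCDF-4", "-c", chunk_spec, src, dst]


def run(command):
    """Run one of the commands above and raise if the tool reports failure."""
    subprocess.run(command, check=True)
```

For example, `run(ncrcat_command(files_for_1980, "dis_1980.nc"))` followed by `run(nccopy_command("dis_1980.nc", "dis_1980_chunked.nc", 366))` would produce one annual file whose chunks each span the whole year for a small block of grid cells.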
See also Chunking Data: Why it Matters.
Why not simply extract the point with CDO first, and then read in the point data:
cdo remapnn,lon=83.55/lat=27.95 input.nc point_output.nc
On Ubuntu, if you don't have CDO installed, you can install it with
sudo apt-get install cdo