Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to subset data using multidimensional coordinates using python xarray?

I have a netcdf file that uses multidimensional coordinates. My xarray dataset looks like this

<xarray.Dataset>
Dimensions:           (Time: 48, bottom_top: 50, bottom_top_stag: 51, 
soil_layers_stag: 4, south_north: 1015, south_north_stag: 1016, west_east: 1359, west_east_stag: 1360)
Coordinates:
XLAT              (Time, south_north, west_east) float32 18.1363 18.1456 ...
XLAT_U            (Time, south_north, west_east_stag) float32 18.1316 ...
XLAT_V            (Time, south_north_stag, west_east) float32 18.1198 ...
XLONG             (Time, south_north, west_east) float32 -122.884 ...
XLONG_U           (Time, south_north, west_east_stag) float32 -122.901 ...
XLONG_V           (Time, south_north_stag, west_east) float32 -122.879 ...
  * Time              (Time) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 ...
  * south_north       (south_north) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 ...
  * west_east         (west_east) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 ...
  * bottom_top        (bottom_top) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 ...

Data variables:
GRAUPEL_ACC_NC    (Time, south_north, west_east) float32 0.0 0.0 0.0 0.0 ...
P                 (Time, south_north, west_east) float32 101112.0 ...
PREC_ACC_NC       (Time, south_north, west_east) float32 0.0 0.0 0.0 0.0 ...
QVAPOR            (Time, south_north, west_east) float32 0.0120251 ...
SNOW_ACC_NC       (Time, south_north, west_east) float32 0.0 0.0 0.0 0.0 ...
TK                (Time, south_north, west_east) float32 295.372 295.367 ...
Z                 (Time, south_north, west_east) float32 0.0 0.0 0.0 0.0 ...

I am hoping to get a subset of the data through the physical coordinates (XLAT & XLONG). For example subsetting TK to get the gridpoints that fall within 49 to 55N and -125 to -115W.

Slicing the data does not work e.g. TK[782:898,179:409] because the gridpoints sliced do not follow the constant lines of latitude and longitude which I need.

There was an example using groupby.bins, however I have not been able to figure it out at all. I also attempted using where to mask the values outside of my domain, with no success.

If anyone has any suggestions, that would be greatly appreciated!

like image 486
jdiction Avatar asked Feb 05 '23 06:02

jdiction


1 Answers

This is a perfect use-case for where with drop=True. Something like the following should work:

ds.where((-125 < ds.XLON) & (ds.XLON < -115)
         & (49 < ds.XLAT) & (ds.XLAT < 55), drop=True)

where should work regardless, but one other concern with your dataset is that your spatial coordinates (XLON and XLAT) include "Time" as a dimension. Do these variables really vary over time? If not, you may want to adjust them to remove the time dimension.

like image 93
shoyer Avatar answered Feb 08 '23 17:02

shoyer