Simple question: I want not only the value of the maximum but also its coordinates in an xarray DataArray. How can I do that?
I can, of course, write my own simple reduce function, but I wonder whether there is anything built into xarray.
Update:
xarray now has the idxmax method for selecting the coordinates of the maximum values along one dimension:
In [8]: da = xr.DataArray(
...: np.random.rand(2,3),
...: dims=list('ab'),
...: coords=dict(a=list('xy'), b=list('ijk'))
...: )
In [14]: da
Out[14]:
<xarray.DataArray (a: 2, b: 3)>
array([[0.63059257, 0.00155463, 0.60763418],
[0.19680788, 0.43953352, 0.05602777]])
Coordinates:
* a (a) <U1 'x' 'y'
* b (b) <U1 'i' 'j' 'k'
In [13]: da.idxmax('a')
Out[13]:
<xarray.DataArray 'a' (b: 3)>
array(['x', 'y', 'x'], dtype=object)
Coordinates:
* b (b) <U1 'i' 'j' 'k'
The below answer is still relevant for the maximum over multiple dimensions, though.
You can use da.where() to filter based on the max value:
In [17]: da = xr.DataArray(
np.random.rand(2,3),
dims=list('ab'),
coords=dict(a=list('xy'), b=list('ijk'))
)
In [18]: da.where(da==da.max(), drop=True).squeeze()
Out[18]:
<xarray.DataArray ()>
array(0.96213673)
Coordinates:
a <U1 'x'
b <U1 'j'
Edit: updated the example to show the indexes more clearly, now that xarray doesn't have default indexes
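As a minimal sketch of how the result of the where() approach can be turned into plain Python values, the example below uses fixed numbers instead of random ones so the outcome is predictable. The names peak, max_value and max_coords are illustrative only. It assumes the maximum is unique; with ties, more than one element survives the filter and squeeze() no longer yields a scalar.

```python
import numpy as np
import xarray as xr

# Fixed values so the location of the maximum is known in advance.
da = xr.DataArray(
    np.array([[0.1, 0.9, 0.3], [0.4, 0.2, 0.5]]),
    dims=["a", "b"],
    coords={"a": ["x", "y"], "b": ["i", "j", "k"]},
)

# Keep only the element equal to the global maximum, then drop size-1 dims.
peak = da.where(da == da.max(), drop=True).squeeze()

# Pull the value and its coordinate labels out as plain Python objects.
max_value = peak.values.item()
max_coords = {name: peak[name].values.item() for name in peak.coords}

print(max_value, max_coords)
```

Here max_coords maps each dimension name to the label at which the maximum occurs.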
An idxmax() method would be very welcome in xarray, but nobody has gotten around to implementing it yet.
For now, you can find the coordinates of the maximum by combining argmax and isel:
>>> array = xarray.DataArray(
... [[1, 2, 3], [3, 2, 1]],
... dims=['x', 'y'],
... coords={'x': [1, 2], 'y': ['a', 'b', 'c']})
>>> array
<xarray.DataArray (x: 2, y: 3)>
array([[1, 2, 3],
[3, 2, 1]])
Coordinates:
* x (x) int64 1 2
* y (y) <U1 'a' 'b' 'c'
>>> array.isel(y=array.argmax('y'))
<xarray.DataArray (x: 2)>
array([3, 3])
Coordinates:
* x (x) int64 1 2
y (x) <U1 'c' 'a'
This is probably what .max() should do in every case! Unfortunately we're not quite there yet.
The problem is that it doesn't yet generalize to the maximum over multiple dimensions in the way we would like:
>>> array.argmax() # what??
<xarray.DataArray ()>
array(2)
This happens because argmax automatically flattens the array, like np.argmax. Instead, we probably want something like an array of tuples or a tuple of arrays, indicating the original integer coordinates for the maximum. Contributions for this would also be welcome; see this issue for more details.
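One possible workaround, sketched here rather than taken from xarray itself, is to compute the flattened index on the underlying NumPy array and convert it back to one integer position per dimension with np.unravel_index. The names flat_index, positions, indexers and peak are illustrative. With ties, this picks the first maximum in flattened (row-major) order.

```python
import numpy as np
import xarray as xr

array = xr.DataArray(
    [[1, 2, 3], [3, 2, 1]],
    dims=["x", "y"],
    coords={"x": [1, 2], "y": ["a", "b", "c"]},
)

# Position of the maximum in the flattened array (same as np.argmax).
flat_index = int(array.values.argmax())

# Convert the flat position back to one integer index per dimension.
positions = np.unravel_index(flat_index, array.shape)
indexers = dict(zip(array.dims, (int(p) for p in positions)))

# Select that single element; its coordinates are the labels of the maximum.
peak = array.isel(indexers)

print(indexers)
print(peak)
```

The selected element carries the coordinate labels for every dimension, so they can be read from peak["x"] and peak["y"].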
You can also use stack:
Let's say data is a 3D variable with dimensions time, longitude and latitude, and you want the coordinates of the maximum at each time step.
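stackdata = data.stack(z=('lon', 'lat'))  # collapse lon and lat into a single dimension z
maxi = stackdata.argmax('z')  # position of the maximum along z for each time step
maxipos = stackdata['z'][maxi]  # (lon, lat) pairs at those positions
ntime = maxi.size  # number of time steps
lonmax = [maxipos.values[itr][0] for itr in range(ntime)]
latmax = [maxipos.values[itr][1] for itr in range(ntime)]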
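To make the stack approach concrete, here is a small self-contained sketch with made-up data. The array contents, coordinate values and the names stacked, flat_pos and pairs are assumptions for illustration, not part of the original answer. It looks up the (lon, lat) pairs with plain NumPy indexing on the stacked coordinate's values.

```python
import numpy as np
import xarray as xr

# Made-up data: one clear maximum per time step.
values = np.zeros((2, 2, 3))
values[0, 1, 2] = 5.0  # time 0: maximum at lon=20.0, lat=5.0
values[1, 0, 1] = 7.0  # time 1: maximum at lon=10.0, lat=0.0
data = xr.DataArray(
    values,
    dims=["time", "lon", "lat"],
    coords={"time": [0, 1], "lon": [10.0, 20.0], "lat": [-5.0, 0.0, 5.0]},
)

# Collapse lon and lat into one dimension, then find the maximum along it.
stacked = data.stack(z=("lon", "lat"))
flat_pos = stacked.argmax(dim="z").values

# The stacked coordinate holds (lon, lat) tuples; pick one per time step.
pairs = stacked["z"].values[flat_pos]
lonmax = [float(p[0]) for p in pairs]
latmax = [float(p[1]) for p in pairs]

print(lonmax, latmax)
```

Using the dimension name in argmax rather than an axis number keeps the code independent of the order of the dimensions.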