Suppose I have the following DataArray:
import numpy as np
import xarray

arr = xarray.DataArray(np.arange(6).reshape(2, 3),
                       dims=['A', 'B'],
                       coords=dict(A=['a0', 'a1'],
                                   B=['b0', 'b1', 'b2']))
I want to iterate over the first dimension and do the following (of course I want to do something more complex than printing)
for coor in arr.A.values:
    print(coor, arr.sel(A=coor).values)
and get
a0 [0 1 2]
a1 [3 4 5]
I am new to xarray, so I was wondering whether there is a more natural way to achieve this, something like
for coor, sub_arr in arr.some_method():
    print(coor, sub_arr)
You can simply iterate over the DataArray - each element of the iterator will itself be a DataArray with a single value for the first coordinate:
for a in arr:
    print(a.A.item(), a.values)
prints
a0 [0 1 2]
a1 [3 4 5]
Note the use of the .item() method to access the scalar value of the zero-dimensional array a.A.
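To make the role of .item() concrete, here is a minimal sketch that compares the zero-dimensional coordinate with the scalar extracted from it. It rebuilds the arr from the question; the names first, coord and label are only illustrative.

```python
import numpy as np
import xarray as xr

arr = xr.DataArray(
    np.arange(6).reshape(2, 3),
    dims=["A", "B"],
    coords=dict(A=["a0", "a1"], B=["b0", "b1", "b2"]),
)

first = next(iter(arr))  # first slice along "A", i.e. the row labelled 'a0'
coord = first.A          # still a DataArray, but with zero dimensions
label = coord.item()     # plain Python scalar extracted from it

print(type(coord).__name__, coord.ndim)
print(type(label).__name__, label)
```

The first line printed reports a DataArray with 0 dimensions, the second a plain str holding a0. Without .item(), printing a.A would show the full DataArray repr rather than just the label.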
To iterate over the second dimension, you can just transpose the data:
for b in arr.T: # or arr.transpose()
    print(b.B.item(), b.values)
prints
b0 [0 3]
b1 [1 4]
b2 [2 5]
For multidimensional data, you can move the dimension you want to iterate over to the first place using ellipsis:
for x in arr.transpose("B", ...):
    # x has one less dimension than arr, and x.B is a scalar
    do_stuff_with(x)
The documentation on reshaping and reorganizing data has further details.
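As a rough illustration of the ellipsis form on data with more than two dimensions, the sketch below uses a made-up 3-D array (the name cube, the extra dimension "C" and the shapes dict are assumptions for this example, not part of the question). It assumes an xarray version recent enough to accept ... in transpose.

```python
import numpy as np
import xarray as xr

# Hypothetical 3-D array; dimension "C" has no labels on purpose.
cube = xr.DataArray(
    np.arange(24).reshape(2, 3, 4),
    dims=["A", "B", "C"],
    coords=dict(A=["a0", "a1"], B=["b0", "b1", "b2"]),
)

shapes = {}
# Move "B" to the front; "..." keeps the remaining dimensions in their order.
for x in cube.transpose("B", ...):
    shapes[x.B.item()] = (x.dims, x.shape)
    print(x.B.item(), x.dims, x.shape)
```

Each x here should have dims ('A', 'C') and shape (2, 4), so the loop body sees a 2-D slice per label of B.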
It's an old question, but I find that using groupby is cleaner and makes more intuitive sense to me than using transpose when you want to iterate over a dimension other than the first:
for coor, sub_arr in arr.groupby('A'):
    print(coor)
    print(sub_arr)
prints
a0
<xarray.DataArray (B: 3)>
array([0, 1, 2])
Coordinates:
  * B        (B) <U2 'b0' 'b1' 'b2'
    A        <U2 'a0'
a1
<xarray.DataArray (B: 3)>
array([3, 4, 5])
Coordinates:
  * B        (B) <U2 'b0' 'b1' 'b2'
    A        <U2 'a1'
Also, it seems that older versions of xarray don't handle the ellipsis in transpose correctly (see mgunyho's answer), but groupby still works correctly there.
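For the case this answer is really about, a dimension other than the first, a sketch might look like the following. One assumption to flag: depending on the xarray version, each group may or may not keep a length-1 "B" dimension, so the example calls .squeeze() to get the same 1-D result either way. The columns dict is only there to collect the results.

```python
import numpy as np
import xarray as xr

arr = xr.DataArray(
    np.arange(6).reshape(2, 3),
    dims=["A", "B"],
    coords=dict(A=["a0", "a1"], B=["b0", "b1", "b2"]),
)

columns = {}
for label, sub_arr in arr.groupby("B"):
    # Assumption: some xarray versions keep "B" as a length-1 dimension here;
    # squeeze() drops it if present and is a no-op otherwise.
    sub_arr = sub_arr.squeeze()
    columns[str(label)] = sub_arr.values.tolist()
    print(label, sub_arr.values)
```

This gives the same per-column values as the transpose loop, without having to reorder the dimensions first.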