
xarray equivalent to pandas subtract/add

I'm looking for a concise way to do arithmetic along a single dimension of a DataArray and get the result back as a new DataArray (both the changed and unchanged parts). In pandas, I would do this using df.subtract(), but I haven't found a way to do this with xarray.

Here's how I would subtract the value 2 from the x dimension in pandas:

import numpy as np
import pandas as pd

data = np.arange(0, 6).reshape(2, 3)
xc = np.arange(0, data.shape[0])
yc = np.arange(0, data.shape[1])

df1 = pd.DataFrame(data, index=xc, columns=yc)
df2 = df1.subtract(2, axis='columns')

With xarray, though, I don't know how to do the same:

import xarray as xr

da1 = xr.DataArray(data, coords={'x': xc, 'y': yc}, dims=['x', 'y'])
da2 = ?
user8188435 asked Jun 20 '17 12:06



2 Answers

In xarray, you can subtract from the rows or columns of an array by broadcasting by dimension name: the operand is matched to the dimension it shares with the array, regardless of axis order.

For example:

>>> import xarray

>>> foo = xarray.DataArray([[1, 2, 3], [4, 5, 6]], dims=['x', 'y'])

>>> bar = xarray.DataArray([1, 4], dims='x')

# subtract along 'x'
>>> foo - bar
<xarray.DataArray (x: 2, y: 3)>
array([[0, 1, 2],
       [0, 1, 2]])
Dimensions without coordinates: x, y

>>> baz = xarray.DataArray([1, 2, 3], dims='y')

# subtract along 'y'
>>> foo - baz
<xarray.DataArray (x: 2, y: 3)>
array([[0, 0, 0],
       [3, 3, 3]])
Dimensions without coordinates: x, y

This works similarly to the axis='columns' and axis='index' options that pandas provides, except that the desired dimension is referenced by name.
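
Applied to the DataArray from the question, the same idea can be sketched as follows. Only `da1` comes from the question; the name `offsets` and its values are made up for illustration.

```python
import numpy as np
import xarray as xr

data = np.arange(0, 6).reshape(2, 3)
da1 = xr.DataArray(data, coords={'x': [0, 1], 'y': [0, 1, 2]}, dims=['x', 'y'])

# Hypothetical per-column offsets: one value for each label along 'y'.
offsets = xr.DataArray([0, 10, 20], coords={'y': [0, 1, 2]}, dims='y')

# 'offsets' only has the 'y' dimension, so it is broadcast across 'x'.
da2 = da1 - offsets
print(da2.values)
```

Because the match is made by dimension name, the same expression keeps working if `da1` is transposed.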

shoyer answered Jan 01 '23 02:01



When you do:

df1 = pd.DataFrame(data, index=xc, columns=yc)
df2 = df1.subtract(2, axis='columns')

you are really just subtracting 2 from every element of the DataFrame. Because the operand is a scalar, the axis argument has no effect.

Here is your output from above:

In [15]: df1
Out[15]: 
   0  1  2
0  0  1  2
1  3  4  5

In [16]: df2
Out[16]: 
   0  1  2
0 -2 -1  0
1  1  2  3

Which is equivalent to:

df3 = df1.subtract(2)

In [20]: df3
Out[20]:
   0  1  2
0 -2 -1  0
1  1  2  3

And equivalent to:

df4 = df1 - 2

In [22]: df4
Out[22]:
   0  1  2
0 -2 -1  0
1  1  2  3

Therefore, for an xarray data array:

da1 = xr.DataArray(data, coords={'x': xc, 'y': yc}, dims=['x', 'y'])

da2 = da1 - 2

In [24]: da1
Out[24]:
<xarray.DataArray (x: 2, y: 3)>
array([[0, 1, 2],
       [3, 4, 5]])
Coordinates:
  * y        (y) int64 0 1 2
  * x        (x) int64 0 1

In [25]: da2
Out[25]:
<xarray.DataArray (x: 2, y: 3)>
array([[-2, -1,  0],
       [ 1,  2,  3]])
Coordinates:
  * y        (y) int64 0 1 2
  * x        (x) int64 0 1
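
As a quick check of the equivalence above, the following sketch compares the pandas and xarray results. It also shows the one case where the pandas axis argument does change the result: when the operand is a Series rather than a scalar. The names `row_offsets` and `col_offsets` are hypothetical and not part of the question.

```python
import numpy as np
import pandas as pd
import xarray as xr

data = np.arange(0, 6).reshape(2, 3)
df1 = pd.DataFrame(data)
da1 = xr.DataArray(data, dims=['x', 'y'])

# Scalar operand: axis is irrelevant, and pandas and xarray agree.
scalar_same = df1.subtract(2, axis='columns').equals(df1 - 2)
matches_xarray = np.array_equal((da1 - 2).values, (df1 - 2).values)

# Series operand: axis decides which labels the Series is aligned with.
row_offsets = pd.Series([1, 4])      # hypothetical, one value per row
col_offsets = pd.Series([0, 1, 2])   # hypothetical, one value per column
by_row = df1.subtract(row_offsets, axis='index')
by_col = df1.subtract(col_offsets, axis='columns')

print(scalar_same, matches_xarray)
print(by_row)
print(by_col)
```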

If you instead want to subtract from one specific column only, that's a different problem, which I believe would require assignment by index.
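
A minimal sketch of that single-column case, assuming the goal is to subtract 2 only where y == 1 (the choice of label is an assumption for illustration). The first variant assigns by label on a copy; the second uses `DataArray.where` and leaves the original untouched without an explicit copy.

```python
import numpy as np
import xarray as xr

data = np.arange(0, 6).reshape(2, 3)
da1 = xr.DataArray(data, coords={'x': [0, 1], 'y': [0, 1, 2]}, dims=['x', 'y'])

# Variant 1: label-based assignment on a copy, so da1 itself is not modified.
da2 = da1.copy()
da2.loc[dict(y=1)] = da2.loc[dict(y=1)] - 2

# Variant 2: keep da1 where y != 1, and use da1 - 2 everywhere else.
da3 = da1.where(da1.y != 1, da1 - 2)

print(da2.values)
print(da3.values)
```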

Maria Molina answered Jan 01 '23 00:01
