Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Apply function along time dimension of XArray

I have an image stack stored in an XArray DataArray with dimensions time, x, y on which I'd like to apply a custom function along the time axis of each pixel such that the output is a single image of dimensions x,y.

I have tried: apply_ufunc but the function fails stating that I need to first load the data into RAM (i.e. cannot use a Dask Array). Ideally, I'd like to keep the DataArray as Dask Arrays internally as it isn't possible to load the entire stack into RAM. The exact error message is:

ValueError: apply_ufunc encountered a dask array on an argument, but handling for dask arrays has not been enabled. Either set the dask argument or load your data into memory first with .load() or .compute()

My code currently looks like this:

import numpy as np
import xarray as xr
import pandas as pd 

def special_mean(x, drop_min=False):
    s = np.sum(x)
    n = len(x)
    if drop_min:
    s = s - x.min()
    n -= 1
    return s/n

times = pd.date_range('2019-01-01', '2019-01-10', name='time')

data = xr.DataArray(np.random.rand(10, 8, 8), dims=["time", "y", "x"], coords={'time': times})
data = data.chunk({'time':10, 'x':1, 'y':1})

res = xr.apply_ufunc(special_mean, data, input_core_dims=[["time"]], kwargs={'drop_min': True})

If I do load the data into RAM using .compute then I still end up with an error which states:

ValueError: applied function returned data with unexpected number of dimensions: 0 vs 2, for dimensions ('y', 'x')

I'm not sure entirely what I am missing/doing wrong.

like image 726
System123 Avatar asked Jun 26 '26 02:06

System123


1 Answers

def special_mean(x, drop_min=False):
    s = np.sum(x)
    n = len(x)
    if drop_min:
        s = s - x.min()
    n -= 1
    return s/n

times = pd.date_range('2019-01-01', '2019-01-10', name='time')

data = xr.DataArray(np.random.rand(10, 8, 8), dims=["time", "y", "x"], coords={'time': times})
data = data.chunk({'time':10, 'x':1, 'y':1})

res = xr.apply_ufunc(special_mean, data, input_core_dims=[["time"]], kwargs={'drop_min': True}, dask = 'allowed', vectorize = True)

The code above using the vectorize argument should work.

like image 188
Ales Avatar answered Jun 28 '26 17:06

Ales