Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python xarray.DataArray: resize coordinates

For my Python package numericalmodel that is supposed to help prototyping simple numerical models I wrote classes for self-describing datasets. I recently stumbled upon the awesome xarray library and am now considering to use xarray.Dataset and xarray.DataArray instead of my own classes for data management.

I searched the xarray documentation but didn't find an answer to my question. Sorry if I overlooked something.

As far as I understand xarray, its structures are best used for static data that doesn't change its shape. But for my numericalmodel package I need to constantly expand arrays with new timely values, namely at each new time step. I don't know the points in time in advance because the time step lengths may be determined adaptively for each equation that is solved.

So my question:

How does one expand an xarray.DataArray's coordinate?

This might boil down to „How to expand an xarray.DataArray” in general

A possible method of doing this should:

  • not have too much overhead (e.g. not rely on creating new DataArrays, then reassigning coordinates, then deleting the old one)
  • possibly handle other data variables that depend on that coordinate

My guess is that it's not possible just like this. Normally, if you don't find anything on a question you have, it either means that the answer is bluntly obvious or it's very nontrivial ;-)

Thanks!

nobodyinperson

like image 669
NobodyInPerson Avatar asked Mar 27 '26 14:03

NobodyInPerson


1 Answers

The short answer is that this isn't possible to do efficiently with xarray's current data model (built on numpy arrays). Xarray doesn't support append, because append in numpy entails a complete copy.

See this recent mailing list discussion for details, but basically the currently recommended approach is to accumulate results in a list and call concat only once, at the end: https://groups.google.com/forum/m/#!topic/xarray/BAN7WobAfyw

That said, there are ways this could be done efficiently (with appends along one dimension in amortized constant time) if we built an alternative storage layer to use instead of numpy arrays. Xarray actually has a few of these already, so this is not as hard as it might seem. Anyways, if you're interested in potentially working on this, I am happy to discuss what next steps would look like in a GitHub issue.

like image 100
shoyer Avatar answered Mar 30 '26 03:03

shoyer



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!