I've been using the .append() method to concatenate two tables (with the same fields) in pandas. Unfortunately, this method does not exist in xarray. Is there another way to do it?

Is it possible to append to an xarray.Dataset?

2 Answers

Xarray doesn't have an append method because its data structures are built on top of NumPy's non-resizable arrays, so new elements cannot be appended without copying the entire array. Instead, you should use xarray.concat.

A common pattern is to accumulate Dataset/DataArray objects in a list, and concatenate once at the end:

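import xarray

datasets = []
for example in examples:
    ds = create_an_xarray_dataset(example)  # build one Dataset per example
    datasets.append(ds)  # list append only; no array data is copied here
# Concatenate once, after the loop, along a new 'example' dimension
combined = xarray.concat(datasets, dim='example')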

You don't want to concatenate inside the loop -- each call would copy everything accumulated so far, which makes your code run in quadratic time.
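As a concrete illustration of the list-then-concatenate pattern, here is a minimal runnable sketch. The `make_dataset` helper, the variable name, and the array shapes are assumptions made up for this example; they stand in for whatever `create_an_xarray_dataset` does in your code.

```python
import numpy as np
import xarray


def make_dataset(value):
    """Hypothetical stand-in for create_an_xarray_dataset()."""
    data = np.full((2, 3), float(value))
    return xarray.Dataset({"my_variable": (("x", "y"), data)})


examples = [10, 20, 30]

# Accumulate in a plain Python list, then copy the array data only once.
datasets = [make_dataset(example) for example in examples]
combined = xarray.concat(datasets, dim="example")

# The new dimension has no labels yet, so attach the example values as a coordinate.
combined = combined.assign_coords(example=examples)

print(combined.sizes)
```

After this, `combined.sel(example=20)` retrieves the slice that came from the second dataset.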

Alternatively, you could allocate a single Dataset/DataArray for the result, and fill in the values with indexing, e.g.,

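import numpy as np
import xarray

dims = ('example', 'x', 'y')
# Preallocate the full result once (here each example is a 100 x 200 array)
combined = xarray.Dataset(
    data_vars={'my_variable': (dims, np.zeros((len(examples), 100, 200)))},
    coords={'example': examples})
for example in examples:
    # Fill in one slice along 'example', selected by label
    combined.loc[dict(example=example)] = create_an_xarray_dataset(example)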

(Note that you always need to use indexing with square brackets like [] or .loc[] -- assigning with sel() and isel() doesn't work.)
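The preallocate-and-fill approach might look like the following runnable sketch. It assigns through the DataArray's `.loc[]` (one variable at a time) rather than through the Dataset, and the example labels, shapes, and fill values are assumptions chosen only for illustration.

```python
import numpy as np
import xarray

examples = ["a", "b", "c"]
dims = ("example", "x", "y")

# Allocate the full result once; the trailing (2, 3) shape is an assumption.
combined = xarray.Dataset(
    data_vars={"my_variable": (dims, np.zeros((len(examples), 2, 3)))},
    coords={"example": examples},
)

for i, example in enumerate(examples):
    values = np.full((2, 3), float(i + 1))  # hypothetical per-example data
    # Square-bracket assignment writes into the preallocated array in place.
    combined["my_variable"].loc[dict(example=example)] = values

print(combined["my_variable"].sel(example="b").values)
```

Writing `combined["my_variable"].sel(example=example) = values` instead is not valid Python, since a function call cannot be the target of an assignment.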

These two approaches are equally efficient -- it's really a matter of taste which one looks better to you or works better for your application.

For what it's worth, pandas has the same limitation: the append method does indeed copy entire dataframes each time it is used. This is a perpetual surprise and source of performance issues for new users. So I do think that we made the right design decision not including it in xarray.
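The same accumulate-then-concatenate idea carries over to pandas through `pandas.concat`. A minimal sketch, with made-up column names and values:

```python
import pandas as pd

frames = []
for i in range(3):
    # Hypothetical per-iteration table; every frame has the same columns.
    frames.append(pd.DataFrame({"id": [i], "value": [i * 1.5]}))

# One concatenation at the end, instead of copying the table on every iteration.
table = pd.concat(frames, ignore_index=True)
print(table)
```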

167

answered Oct 09 '22 13:10

shoyer

You can use either xarray.concat or merge(); see the xarray documentation for both.
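The two functions address different situations, roughly as sketched below: `xarray.concat` joins objects along a dimension, while `xarray.merge` combines objects that hold different variables. The variable names and values here are invented for the example.

```python
import numpy as np
import xarray

# Two hypothetical datasets with the same variable but different 'time' labels.
first = xarray.Dataset(
    {"temp": ("time", np.array([1.0, 2.0]))}, coords={"time": [0, 1]}
)
second = xarray.Dataset(
    {"temp": ("time", np.array([3.0, 4.0]))}, coords={"time": [2, 3]}
)

# concat joins the like-named variable end to end along 'time'.
stacked = xarray.concat([first, second], dim="time")

# merge combines datasets that hold different variables on shared coordinates.
humidity = xarray.Dataset(
    {"humidity": ("time", np.array([0.1, 0.2, 0.3, 0.4]))},
    coords={"time": [0, 1, 2, 3]},
)
merged = xarray.merge([stacked, humidity])

print(merged)
```

For appending rows of the same fields, as in the question, `concat` is the closer match.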

answered Oct 09 '22 13:10

bkaf

Tags:

python

pandas

numpy

python-xarray

Itay Lieder
