Is there a simple way of flattening an xarray dataset into a single 1D numpy array?
For example, flattening the following test dataset:
xr.Dataset({
'a' : xr.DataArray(
data=[10,11,12,13,14],
coords={'x':[0,1,2,3,4]},
dims=['x']
),
'b' : xr.DataArray(data=1,coords={'y':0}),
'c' : xr.DataArray(data=2,coords={'y':0}),
'd' : xr.DataArray(data=3,coords={'y':0})
})
to
[10,11,12,13,14,1,2,3]
?
If you're OK with repeated values, you can use .to_array(), which broadcasts the variables against each other,
and then flatten the values in NumPy, e.g.,
>>> ds.to_array().values.ravel()
array([10, 11, 12, 13, 14, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3,
3, 3, 3])
If you don't want repeated values, then you'll need to write something yourself, e.g.,
>>> np.concatenate([v.values.ravel() for v in ds.data_vars.values()])
array([10, 11, 12, 13, 14, 1, 2, 3])
More generally, this sounds somewhat similar to a proposed interface for "stacking" data variables in 2D for machine learning applications: https://github.com/pydata/xarray/issues/1317
As of July 2019, xarray has the methods Dataset.to_stacked_array and DataArray.to_unstacked_dataset, which stack the data variables into a single array and reverse that operation.
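A minimal sketch of how to_stacked_array can be used. Note that sample_dims must be shared by every variable, and the variables in the question's dataset share no dimension, so this sketch uses hypothetical example data (the names data, stacked and the dimension "z" are made up for illustration):

```python
import xarray as xr

# Hypothetical example data (not the dataset from the question):
# both variables share the dimension "x".
data = xr.Dataset(
    data_vars={
        "a": (("x", "y"), [[0, 1, 2], [3, 4, 5]]),
        "b": ("x", [6, 7]),
    },
    coords={"y": ["u", "v", "w"]},
)

# Keep "x" as the sample dimension and stack everything else
# (the variables and their remaining dimensions) along a new dimension "z".
stacked = data.to_stacked_array("z", sample_dims=["x"])

print(stacked.dims)    # ('x', 'z')
print(stacked.values)  # one row per "x" entry: a's values followed by b's
```

The result is 2D rather than 1D: one row per sample, with each variable's values laid side by side. Calling to_unstacked_dataset on the stacked array with dim="z" should give the original variables back.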
Get the Dataset from the question:
import numpy as np
import xarray as xr
ds = xr.Dataset({
'a' : xr.DataArray(
data=[10,11,12,13,14],
coords={'x':[0,1,2,3,4]},
dims=['x']
),
'b' : xr.DataArray(data=1,coords={'y':0}),
'c' : xr.DataArray(data=2,coords={'y':0}),
'd' : xr.DataArray(data=3,coords={'y':0})
})
Get the list of data variables:
variables = ds.data_vars
Use the ndarray.flatten() method to reduce each array to 1D:
arrays = [ ds[i].values.flatten() for i in variables ]
Then flatten the list of 1D arrays into a single list (as detailed in this answer):
arrays = [i for j in arrays for i in j ]
Finally, convert this list to a NumPy array, as requested in the question:
array = np.array(arrays)
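The steps above can be collected into one small helper. This is only a sketch: flatten_dataset is a hypothetical name, not an xarray function, and it uses np.concatenate in place of the nested list comprehension.

```python
import numpy as np
import xarray as xr


def flatten_dataset(ds):
    """Concatenate every data variable of ds into one 1D NumPy array.

    Hypothetical helper (not part of xarray). Variables are taken in the
    order in which they appear in ds.data_vars.
    """
    # ds.data_vars iterates over variable names; ravel() flattens each
    # variable's values (scalars become length-1 arrays).
    return np.concatenate([ds[name].values.ravel() for name in ds.data_vars])


ds = xr.Dataset({
    "a": xr.DataArray(
        data=[10, 11, 12, 13, 14],
        coords={"x": [0, 1, 2, 3, 4]},
        dims=["x"],
    ),
    "b": xr.DataArray(data=1, coords={"y": 0}),
    "c": xr.DataArray(data=2, coords={"y": 0}),
    "d": xr.DataArray(data=3, coords={"y": 0}),
})

flat = flatten_dataset(ds)
print(flat)  # [10 11 12 13 14  1  2  3]
```

The flat array carries no record of which values came from which variable, so keep the variable names and shapes around if you need to map it back.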