I have a dataset with ints in it, and I'd like to select a subdataset by some criterion while preserving the integer datatype. It seems that xarray forcibly converts the integer data to float.
import numpy
import xarray
nums = numpy.random.randint(0, 100, 13)
names = numpy.random.choice(["babadook", "samara", "jason"], 13)
data_vars = {"num": xarray.DataArray(nums), "name": xarray.DataArray(names)}
dataset = xarray.Dataset(data_vars)
print(dataset)
<xarray.Dataset>
Dimensions: (dim_0: 13)
Coordinates:
* dim_0 (dim_0) int64 0 1 2 3 4 5 6 7 8 9 10 11 12
Data variables:
num (dim_0) int64 93 99 49 35 92 14 41 57 28 59 74 1 15
name (dim_0) <U8 'babadook' 'samara' 'samara' 'samara' 'jason' ...
subdataset = dataset.where(dataset.num < 50, drop=True)
print(subdataset)
<xarray.Dataset>
Dimensions: (dim_0: 7)
Coordinates:
* dim_0 (dim_0) int64 2 3 5 6 8 11 12
Data variables:
num (dim_0) float64 49.0 35.0 14.0 41.0 28.0 1.0 15.0
name (dim_0) <U32 'samara' 'samara' 'jason' 'babadook' 'jason' ...
That's because NumPy (which xarray uses under the hood) has no way of representing NaN in an integer array. So for most where results, the dtype has to be coerced to float.
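To see the underlying constraint in isolation, here is a minimal NumPy-only sketch. The array values and the variable names (`ints`, `assigned`, `masked`) are made up for illustration; it only shows that an integer array rejects NaN and that masking with NaN yields a float result.

```python
import numpy as np

# Illustrative values only; any integer array behaves the same way.
ints = np.array([93, 14, 57, 1], dtype=np.int64)

# An integer array cannot store NaN: the assignment raises ValueError.
try:
    ints[0] = np.nan
    assigned = True
except ValueError:
    assigned = False

# Filling masked-out slots with NaN forces promotion to a float dtype.
masked = np.where(ints < 50, ints, np.nan)

print(assigned)      # whether the NaN assignment succeeded
print(masked.dtype)  # dtype after masking with NaN
```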
If drop=True and every masked value is dropped, that's not actually a constraint: the new array could retain its dtype, because there's no need for NaN values. That's not in xarray at the moment, but it could be added as a feature.
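In the meantime, one possible workaround is to skip where entirely and index with a boolean mask, which never introduces NaN. The sketch below is an assumption about what fits this case rather than an official recommendation; it fixes `nums` to the values from the question's output so the result is reproducible, and it passes the mask as a plain NumPy boolean array to `isel`.

```python
import numpy as np
import xarray as xr

# Same numbers as in the question's printed output, fixed for reproducibility.
nums = np.array([93, 99, 49, 35, 92, 14, 41, 57, 28, 59, 74, 1, 15])
names = np.random.choice(["babadook", "samara", "jason"], 13)
dataset = xr.Dataset({"num": xr.DataArray(nums), "name": xr.DataArray(names)})

# Boolean mask along the only dimension ("dim_0" is xarray's default name).
mask = (dataset.num < 50).values

# Positional selection drops the unwanted rows without filling anything,
# so no NaN is needed and the integer dtype is left alone.
subdataset = dataset.isel(dim_0=mask)

print(subdataset.num.dtype)
print(subdataset.num.values)
```

Because the selection is 1-D and nothing is filled in, casting back afterwards (for example with `astype` on the `num` variable of the where(..., drop=True) result) should also work, at the cost of a round trip through float.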