
How to upsample xarray by seconds and include bounding hours

I have an xarray.DataArray with a time coordinate like

ary["time"] = [
    "2000-01-01T03:04:05",  # leading records are missing,
    "2000-01-01T03:04:06",
    "2000-01-01T03:04:08",  # some medium records are missing,
    "2000-01-01T03:04:09",
    "2000-01-01T03:04:11",
    ...
    "2000-01-01T06:54:02",
    "2000-01-01T06:54:03"   # and trailing records are missing.
]

and want to re-index to

ary["time"] = [
    "2000-01-01T03:00:00",
    "2000-01-01T03:00:01",
    "2000-01-01T03:00:02",
    ...
    "2000-01-01T03:04:06",
    "2000-01-01T03:04:07",
    "2000-01-01T03:04:08",
    "2000-01-01T03:04:09",
    ...
    "2000-01-01T06:59:57",
    "2000-01-01T06:59:58",
    "2000-01-01T06:59:59"
]

with NaN at all missing records.

I found ary = ary.resample(time="1S").asfreq(), but it only fills the interior gaps.

How can I indicate that the left and right bounds should be rounded to whole hours (or minutes, or days)?


Sample (taken from a gist):

from datetime import datetime, timedelta

import numpy as np
import pandas as pd
import xarray as xr


def make_ary():
    time = []
    for i in range(300, 14000):
        if i % 3 != 2 and i % 5 != 2:
            time.append(datetime(2000, 1, 1, 3, 0, 0) + timedelta(seconds=i))

    data = np.random.rand(len(time))
    return xr.DataArray(data=data, coords=[("time", time)], dims=["time"])


def make_expected():
    expected = []
    for i in range(0, 4*60*60):
        expected.append(
            datetime(2000, 1, 1, 3, 0, 0) + timedelta(seconds=i)
        )
    return pd.to_datetime(np.array(expected))


def make_not_expected():
    '''
    result of 'inserts medium records'
    '''
    not_expected = []
    for i in range(300, 14000):
        not_expected.append(
            datetime(2000, 1, 1, 3, 0, 0) + timedelta(seconds=i)
        )
    return pd.to_datetime(np.array(not_expected))


def resample(ary):
    return ary.resample(time="1S").asfreq()


def main():
    ary = make_ary()
    expected = make_expected()
    not_expected = make_not_expected()

    print(np.array_equal(ary["time"].values, expected))  # False

    ary = resample(ary)
    print(np.array_equal(ary["time"], expected))      # False
    print(np.array_equal(ary["time"], not_expected))  # True, but not expected


main()
v..snow asked Mar 16 '26 22:03

1 Answer

Use DataArray.reindex (Documentation)

DataArray.reindex might be a better choice in this particular scenario.

In the code sample below, the date range of the target array is built with pd.date_range (note that the parameter closed is set to "left" because we don't want the range to include "2000-01-01T07:00:00").

start_time = "2000-01-01T03:00:00"
end_time = "2000-01-01T07:00:00"
new_ary = ary.reindex(time=pd.date_range(start=start_time, end=end_time, freq="1S", closed="left"))
print(new_ary["time"])

This gives the following output:

<xarray.DataArray 'time' (time: 14400)>
array(['2000-01-01T03:00:00.000000000', '2000-01-01T03:00:01.000000000',
       '2000-01-01T03:00:02.000000000', ..., '2000-01-01T06:59:57.000000000',
       '2000-01-01T06:59:58.000000000', '2000-01-01T06:59:59.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 2000-01-01T03:00:00 ... 2000-01-01T06:59:59
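Note: the closed parameter of pd.date_range was deprecated in pandas 1.4 and removed in pandas 2.0 in favor of inclusive, so on a newer pandas the equivalent call would be (a sketch, using the lowercase "1s" alias to avoid deprecation warnings):

```python
import pandas as pd

# On pandas >= 1.4, `inclusive="left"` replaces `closed="left"`:
# include the start bound, exclude the end bound.
idx = pd.date_range(start="2000-01-01T03:00:00",
                    end="2000-01-01T07:00:00",
                    freq="1s", inclusive="left")
# idx covers 2000-01-01T03:00:00 .. 2000-01-01T06:59:59, one entry per second.
```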

By default, reindex fills missing values with NaN. The output of the test code below demonstrates that the missing records between "2000-01-01T03:08:06" and "2000-01-01T03:08:09" are set to NaN in the new array.

print(ary[100:102])
# Non NaN values start from index 300 for new_ary
print(new_ary[486:490])

Outputs:

<xarray.DataArray (time: 2)>
array([0.25910861, 0.07897777])
Coordinates:
  * time     (time) datetime64[ns] 2000-01-01T03:08:06 2000-01-01T03:08:09
<xarray.DataArray (time: 4)>
array([0.25910861,        nan,        nan, 0.07897777])
Coordinates:
  * time     (time) datetime64[ns] 2000-01-01T03:08:06 ... 2000-01-01T03:08:09
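To avoid hard-coding the bounds (and to cover the "minutes or days" part of the question), the start and end can also be derived from the data itself with Timestamp.floor/ceil. A minimal sketch, assuming pandas >= 1.4 for the inclusive keyword:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Small array with gaps, shaped like the one in the question.
time = pd.to_datetime(["2000-01-01T03:04:05", "2000-01-01T03:04:06",
                       "2000-01-01T03:04:08", "2000-01-01T06:54:03"])
ary = xr.DataArray(np.arange(4.0), coords=[("time", time)], dims=["time"])

# Snap the bounds to whole hours; swap "h" for "min" or "D" to round
# to whole minutes or days instead.
start = pd.Timestamp(ary["time"].values.min()).floor("h")
end = pd.Timestamp(ary["time"].values.max()).ceil("h")

# Reindex to one record per second; missing records become NaN.
new_ary = ary.reindex(time=pd.date_range(start, end, freq="1s", inclusive="left"))
```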
PIG208 answered Mar 18 '26 11:03