Suppose I have a DataFrame
import numpy as np
import pandas as pd
df = pd.DataFrame(dict(vals=np.random.randint(0, 10, 10)),
                  index=pd.date_range('20170401', '20170410'))
>>> df
vals
2017-04-01 9
2017-04-02 8
2017-04-03 4
2017-04-04 5
2017-04-05 9
2017-04-06 9
2017-04-07 5
2017-04-08 3
2017-04-09 3
2017-04-10 1
and a particular date which I know is in my index but do not know the position of, for example
cur_dt = df.index[np.random.randint(0, df.index.size)]
>>> cur_dt
Timestamp('2017-04-05 00:00:00', freq='D')
Given cur_dt, I want to determine what the previous and next values in my index are. Should cur_dt be the first (last) value in my index, then the previous (next) element should be cur_dt itself.
To recap, my question is, what is the easiest way to find the previous and next value in my index (or my current value itself if it is an endpoint) given my current value?
My current approach seems rather roundabout, which is my motivation for asking.
cur_iloc = df.index.get_loc(cur_dt)
prev = cur_dt if cur_iloc == 0 else df.index[cur_iloc-1]
next = cur_dt if cur_iloc == df.index.size-1 else df.index[cur_iloc+1]
>>> prev
Timestamp('2017-04-04 00:00:00', freq='D')
>>> next
Timestamp('2017-04-06 00:00:00', freq='D')
If there's no more straightforward way after all then my apologies. I'm imagining being able to just "shift" my index from my current value once forwards and once backwards (with some nice treatment for endpoints), but am not sure if this is possible.
Assuming that the index is sorted, try to use numpy.searchsorted:
Source data sets:
In [185]: df
Out[185]:
vals
2017-04-01 5
2017-04-02 3
2017-04-03 9
2017-04-04 8
2017-04-05 1
2017-04-06 0
2017-04-07 4
2017-04-08 5
2017-04-09 1
2017-04-10 8
In [186]: cur_dt
Out[186]: Timestamp('2017-04-02 00:00:00', freq='D')
Solution:
In [187]: idx = np.searchsorted(df.index, cur_dt)
In [188]: df.index[max(0, idx-1)]
Out[188]: Timestamp('2017-04-01 00:00:00', freq='D')
In [189]: df.index[min(idx+1, len(df)-1)]
Out[189]: Timestamp('2017-04-03 00:00:00', freq='D')
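The same idea can be wrapped in a small helper. This is only a sketch: the function name neighbours is made up for illustration, and it assumes the index is sorted, has no duplicates, and actually contains the date being looked up. It uses the index's own searchsorted method, which with the default side='left' returns the position of the date when it is present.

```python
import pandas as pd

def neighbours(index, cur):
    """Return (previous, next) labels around `cur`.

    Hypothetical helper: assumes `index` is sorted, unique, and contains `cur`.
    """
    pos = index.searchsorted(cur)   # position of `cur` (side='left' by default)
    last = len(index) - 1
    # Clamp both positions so the endpoints map onto themselves.
    return index[max(pos - 1, 0)], index[min(pos + 1, last)]

idx = pd.date_range('20170401', '20170410')
print(neighbours(idx, pd.Timestamp('2017-04-05')))  # interior date
print(neighbours(idx, idx[0]))                      # first date: previous is itself
print(neighbours(idx, idx[-1]))                     # last date: next is itself
```

Clamping with max/min is what gives the "current value itself at an endpoint" behaviour asked for in the question.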
Reset your index and then use boolean logic to identify the location of your cur_dt, like so:
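df = df.reset_index()  # the dates move into a column named 'index'
# boolean mask -> integer position of cur_dt in the new RangeIndex
cur_dt_index = df.index[df['index'] == cur_dt][0]
previous = max(cur_dt_index - 1, 0)
# clamp to the last valid position, which is df.shape[0] - 1
next = min(cur_dt_index + 1, df.shape[0] - 1)
# df.loc[previous, 'index'] and df.loc[next, 'index'] then give the dates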
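If resetting the index is not desirable, the positions can also be clamped directly on the original DatetimeIndex. The following is a rough sketch rather than a definitive recipe: the sample frame and the chosen cur_dt are made up for illustration, and it assumes a unique index so that get_loc returns a single integer.

```python
import numpy as np
import pandas as pd

# Illustrative data, not the random values from the question.
df = pd.DataFrame({'vals': range(10)},
                  index=pd.date_range('20170401', '20170410'))
cur_dt = pd.Timestamp('2017-04-10')   # deliberately the last date

pos = df.index.get_loc(cur_dt)        # integer position (unique index assumed)
# Clip both neighbouring positions into the valid range [0, len - 1].
prev_pos, next_pos = np.clip([pos - 1, pos + 1], 0, len(df.index) - 1)
prev_dt, next_dt = df.index[prev_pos], df.index[next_pos]

print(prev_dt, next_dt)
```

Because cur_dt is the last date here, next_dt comes back as cur_dt itself, while prev_dt is the day before.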