I can't understand why I'm getting KeyError: Timestamp('...') when using loc on a date index.
Given this df (dtypes are datetime64[ns], int, int; DATE1 is the index):
DATE1 VALUE2 VALUE3
2021-08-20 00:00:00 11 424
2021-08-21 00:00:00 22 424
2021-08-22 00:00:00 33 424
2021-08-23 00:00:00 44 242
I'm trying to use loc on index like this:
from datetime import date
start_date = date(2021, 8, 20)
end_date = date(2021, 8, 23)
df = df.loc[start_date:end_date]
and this is working fine. I'm getting 4 records. However when I do this:
start_date = date(2021, 8, 20)
end_date = date(2021, 8, 24)  # end_date is later than any value in the dataframe
df = df.loc[start_date:end_date]
I'm getting KeyError: Timestamp('2021-08-24 00:00:00'). Could someone point me to how to resolve this?
In order to use label-based slices with bounds outside of index range, the index must be monotonically increasing or decreasing.
From pandas docs:
If the index of a Series or DataFrame is monotonically increasing or decreasing, then the bounds of a label-based slice can be outside the range of the index, much like slice indexing a normal Python list. Monotonicity of an index can be tested with the is_monotonic_increasing() and is_monotonic_decreasing() attributes.
On the other hand, if the index is not monotonic, then both slice bounds must be unique members of the index.
You can sort the index with df.sort_index() (it returns a new DataFrame, so assign the result back, e.g. df = df.sort_index()); after that, out-of-bounds slices should work.
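Here is a minimal sketch of that behaviour. The frame is a hypothetical one modelled on the question, with the rows deliberately shuffled so the index is not monotonic (the question does not show the real row order), and pd.Timestamp bounds are used in place of datetime.date for clarity:

```python
import pandas as pd

# Hypothetical data based on the question; rows are intentionally out of order.
df = pd.DataFrame(
    {"VALUE2": [33, 11, 44, 22], "VALUE3": [424, 424, 242, 424]},
    index=pd.DatetimeIndex(
        ["2021-08-22", "2021-08-20", "2021-08-23", "2021-08-21"], name="DATE1"
    ),
)

start_date = pd.Timestamp("2021-08-20")
end_date = pd.Timestamp("2021-08-24")  # later than any date in the index

# is_monotonic_increasing is a property, so it is read without parentheses.
print(df.index.is_monotonic_increasing)

# On the unsorted index, a bound that is not a member of the index raises KeyError.
try:
    df.loc[start_date:end_date]
    raised = False
except KeyError:
    raised = True
print(raised)

# After sorting, the index is monotonic and the out-of-range bound is accepted.
sorted_df = df.sort_index()
result = sorted_df.loc[start_date:end_date]
print(result)
```

If the index is already sorted and the error still appears, it is worth checking df.index.is_monotonic_increasing directly, since a single out-of-order row is enough to make it False.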