This code creates a DataFrame whose index is a 10-minute date range:
import pandas as pd
import datetime as dt
date_range = pd.date_range(end=dt.datetime(2017, 1, 6, 15, 00), periods=10, freq='10Min')
df = pd.DataFrame(index=date_range)
df['A'] = 1
print(df)
It outputs:
                     A
2017-01-06 13:30:00  1
2017-01-06 13:40:00  1
2017-01-06 13:50:00  1
2017-01-06 14:00:00  1
2017-01-06 14:10:00  1
2017-01-06 14:20:00  1
2017-01-06 14:30:00  1
2017-01-06 14:40:00  1
2017-01-06 14:50:00  1
2017-01-06 15:00:00  1
My question is:
How can I set column A to 0 in the three rows that follow a gap in the index?
For example, if we delete a specific row:
df = df[df.index != dt.datetime(2017, 1, 6, 14, 00)]
It outputs:
                     A
2017-01-06 13:30:00  1
2017-01-06 13:40:00  1
2017-01-06 13:50:00  1
2017-01-06 14:10:00  1
2017-01-06 14:20:00  1
2017-01-06 14:30:00  1
2017-01-06 14:40:00  1
2017-01-06 14:50:00  1
2017-01-06 15:00:00  1
Now there is a missing 10-minute step after 13:50 (the 14:00 row is gone), so the following three rows of A must be set to 0.
So this would be the desired result:
                     A
2017-01-06 13:30:00  1
2017-01-06 13:40:00  1
2017-01-06 13:50:00  1
2017-01-06 14:10:00  0
2017-01-06 14:20:00  0
2017-01-06 14:30:00  0
2017-01-06 14:40:00  1
2017-01-06 14:50:00  1
2017-01-06 15:00:00  1
There is a Python fiddle where you can try it: https://repl.it/FaXZ/2
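You can use the following; the slice length sets how many rows are zeroed (pos + 3 would give the three rows asked for):
# flag rows whose distance to the previous timestamp exceeds 10 minutes
mask = df.index.to_series().diff() > pd.Timedelta('00:10:00')
# label and integer position of the first flagged row
idx = mask.idxmax()
pos = df.index.get_loc(idx)
# set values by position; indexing the frame directly avoids chained assignment
df.iloc[pos:pos + 2, df.columns.get_loc('A')] = 0
print(df)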
                     A
2017-01-06 13:30:00  1
2017-01-06 13:40:00  1
2017-01-06 13:50:00  1
2017-01-06 14:10:00  0
2017-01-06 14:20:00  0
2017-01-06 14:30:00  1
2017-01-06 14:40:00  1
2017-01-06 14:50:00  1
2017-01-06 15:00:00  1
# a longer slice zeroes more rows after the gap
df.iloc[pos:pos + 5, df.columns.get_loc('A')] = 0
print(df)
                     A
2017-01-06 13:30:00  1
2017-01-06 13:40:00  1
2017-01-06 13:50:00  1
2017-01-06 14:10:00  0
2017-01-06 14:20:00  0
2017-01-06 14:30:00  0
2017-01-06 14:40:00  0
2017-01-06 14:50:00  0
2017-01-06 15:00:00  1
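Note that idxmax returns only the first True in the mask, so the snippet above handles a single gap, and with no gap at all it would point at the first row. A minimal sketch that treats every gap, assuming the expected spacing is 10 minutes; the function name zero_after_gaps and its parameters are hypothetical, not part of pandas:

```python
import numpy as np
import pandas as pd


def zero_after_gaps(df, column="A", step="10min", n_rows=3):
    """Return a copy of df with `column` set to 0 in the n_rows rows after each gap.

    Hypothetical helper for illustration; `step` is the assumed regular spacing.
    """
    out = df.copy()
    # True where the distance to the previous timestamp exceeds the expected step
    gap = out.index.to_series().diff() > pd.Timedelta(step)
    # Integer positions of the first row after each gap (empty if there is none)
    starts = np.flatnonzero(gap.to_numpy())
    col = out.columns.get_loc(column)
    for pos in starts:
        # iloc slices are clipped at the end of the frame, so a late gap is safe
        out.iloc[pos:pos + n_rows, col] = 0
    return out


# Rebuild the example from the question: a 10-minute index with 14:00 removed
index = pd.date_range(end="2017-01-06 15:00", periods=10, freq="10min")
full = pd.DataFrame({"A": 1}, index=index)
df = full[full.index != pd.Timestamp("2017-01-06 14:00")]

result = zero_after_gaps(df)
print(result)
```

With the example data this zeroes 14:10, 14:20 and 14:30, and a frame without gaps is returned unchanged.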
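Alternatively, rebuild the column from a shifted boolean mask (note that this overwrites every value in A):
temp = df.index.to_series().diff() > pd.Timedelta('00:10:00')
# fill_value=False keeps the shifted mask boolean instead of introducing NaN
df['A'] = 1 - (temp | temp.shift(1, fill_value=False)).astype(int)
This zeroes the row after the gap and the one following it: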
                     A
2017-01-06 13:30:00  1
2017-01-06 13:40:00  1
2017-01-06 13:50:00  1
2017-01-06 14:10:00  0
2017-01-06 14:20:00  0
2017-01-06 14:30:00  1
2017-01-06 14:40:00  1
2017-01-06 14:50:00  1
2017-01-06 15:00:00  1
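The shifted mask covers two rows; reaching the three rows from the question would need one more shift. One way to generalise this, sketched here under the assumption that a count-based rolling window is acceptable, is to take a rolling maximum over the gap flags. The variable n_rows is a hypothetical name for the number of rows to zero:

```python
import pandas as pd

# Rebuild the example from the question: a 10-minute index with 14:00 removed
index = pd.date_range(end="2017-01-06 15:00", periods=10, freq="10min")
df = pd.DataFrame({"A": 1}, index=index)
df = df[df.index != pd.Timestamp("2017-01-06 14:00")].copy()

n_rows = 3  # hypothetical parameter: how many rows to zero after each gap

# True where the distance to the previous timestamp exceeds 10 minutes
gap = df.index.to_series().diff() > pd.Timedelta("10min")

# A row is affected if it, or one of the n_rows - 1 rows before it, follows a gap.
# An integer window counts rows, not time, and min_periods=1 avoids leading NaN.
affected = gap.astype(int).rolling(n_rows, min_periods=1).max().astype(bool)

# Overwrite only the affected rows so other values in A are left as they were
df.loc[affected, "A"] = 0
print(df)
```

Unlike the assignment above, this changes only the flagged rows, so it also works when A holds values other than 1.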