Let's say I have the following dataframe:
df = pd.DataFrame({'a': [10, 20, 30, 40, 50], 'b': [0, 10, 40, 45, 50]}, columns = ['a', 'b'])
I would like to make a list of indices where:
a[i - 1] < b[i] and a[i] >= b[i]
in order to detect when one value in a time series crosses another.
Is there an idiomatic pandas way to achieve this without iterating through all the elements?
I tried to create a new column with flags to indicate a crossing by doing this:
df['t'] = (df['a'].shift(1).values < df['b'].values and df['a'].values >= df['b']).astype(bool)
but that raises an error. I'm not sure how to approach this problem, short of looping through all the elements.
You can combine Series.shift with Series.lt ("less than", same as <) and Series.ge ("greater than or equal", same as >=):
mask = df['a'].shift().lt(df['b']) & df['a'].ge(df['b'])
# same as (df['a'].shift() < df['b']) & (df['a'] >= df['b'])
0 False
1 False
2 False
3 False
4 True
dtype: bool
Note that we don't have to call astype(bool): comparing two Series already returns a boolean Series, which can be used directly for boolean indexing.
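As for why the attempt in the question fails: assuming di there is a typo for df, the likely culprit is Python's and, which asks each operand for a single truth value, something a multi-element NumPy array refuses to give. The sketch below reproduces that and shows the element-wise & working on the same arrays; the variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [10, 20, 30, 40, 50], 'b': [0, 10, 40, 45, 50]})

# Both comparisons produce NumPy boolean arrays with one entry per row.
prev_below = df['a'].shift(1).values < df['b'].values
now_at_or_above = df['a'].values >= df['b'].values

error_message = None
try:
    # `and` needs a single True/False from each operand, so this raises.
    flags = prev_below and now_at_or_above
except ValueError as exc:
    error_message = str(exc)
    print("`and` failed:", error_message)

# `&` combines the two arrays element by element instead.
flags = prev_below & now_at_or_above
print(flags.tolist())  # [False, False, False, False, True]
```

The same reasoning applies to pandas Series, which is why the mask above is built with & rather than and.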
To get the indices of the rows with True, use:
idx = df[mask].index.tolist()
print(idx)
[4]
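If you want to convince yourself that the mask matches the condition from the question, here is a minimal sketch that wraps the vectorized version in a helper and compares it with a plain loop. The function names crossing_indices and crossing_indices_loop are made up for this example and are not part of pandas.

```python
import pandas as pd

def crossing_indices(frame, a_col='a', b_col='b'):
    """Index labels i where a[i-1] < b[i] and a[i] >= b[i] (vectorized)."""
    a, b = frame[a_col], frame[b_col]
    # The first row has no predecessor: shift() puts NaN there, and NaN < x is False.
    mask = a.shift().lt(b) & a.ge(b)
    return frame.index[mask.to_numpy()].tolist()

def crossing_indices_loop(frame, a_col='a', b_col='b'):
    """Same condition written as an explicit loop, for comparison only."""
    a = frame[a_col].tolist()
    b = frame[b_col].tolist()
    labels = frame.index.tolist()
    return [labels[i] for i in range(1, len(a))
            if a[i - 1] < b[i] and a[i] >= b[i]]

df = pd.DataFrame({'a': [10, 20, 30, 40, 50], 'b': [0, 10, 40, 45, 50]})
vectorized = crossing_indices(df)
looped = crossing_indices_loop(df)
print(vectorized, looped)  # [4] [4]
```

Both versions return index labels rather than positions, so they also behave the same on a frame with a non-default index.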