I want to search a pandas DataFrame column for a target value, looking only in the forward direction, and when a bigger value is found I want to record the index difference in a result column. I managed to do this with two nested for loops, but it was painfully slow.
This is what I want to achieve, shown in a simplified example.
import pandas as pd
d = {
    'Value'  : [8,9,10,12,16,13,11,7,12,18],
    'Target' : [12,12,11,15,19,11,16,11,17,18]
    }
df = pd.DataFrame(data=d)
>>> df
   Target  Value
0      12      8
1      12      9
2      11     10
3      15     12
4      19     16
5      11     13
6      16     11
7      11      7
8      17     12
9      18     18
Our first value is 8 and its target value is 12. We look forward in the Value column for a value that surpasses this target, and we find it in row 4 with value 16. What I want to record is the index difference, which is 4-0=4.
The next value is 9, and again the target value is 12. We look forward in the values and find row 4 again, with value 16. Now the index difference is 4-1=3.
Let's jump to row 4, where the target is 19. We start looking for a value above the target from index 5 onward. If no such value is found, the result is 0.
This is the result column that I want to end up with.
   Target  Value  Result
0      12      8       4
1      12      9       3
2      11     10       1
3      15     12       1
4      19     16       0
5      11     13       3
6      16     11       3
7      11      7       1
8      17     12       1
9      18     18       0
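For reference, the rule described above can be written as a plain-Python baseline. This is only a sketch of the intended behaviour, not the original slow code; the function name `forward_offsets` is hypothetical and chosen for illustration.

```python
import pandas as pd

def forward_offsets(values, targets):
    """Hypothetical helper: for each row i, return the distance to the first
    later row whose value is strictly greater than targets[i], or 0 if none."""
    n = len(values)
    result = [0] * n
    for i in range(n):
        # Only rows after i are considered (forward direction)
        for j in range(i + 1, n):
            if values[j] > targets[i]:
                result[i] = j - i
                break
    return result

df = pd.DataFrame({
    'Value': [8, 9, 10, 12, 16, 13, 11, 7, 12, 18],
    'Target': [12, 12, 11, 15, 19, 11, 16, 11, 17, 18],
})
df['Result'] = forward_offsets(df['Value'].tolist(), df['Target'].tolist())
print(df['Result'].tolist())
# [4, 3, 1, 1, 0, 3, 3, 1, 1, 0]
```

It reproduces the Result column above and can serve as a correctness check for any faster approach.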
Can this be done without for loops?
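Use numpy broadcasting to compare every target against every value, set the lower triangle of the resulting matrix (including the diagonal) to False so that only later rows count, get the index of the first True in each row with numpy.argmax, subtract numpy.arange to turn positions into offsets, and set all non-positive results to 0: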
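import numpy as np

# Targets as a column vector, so the comparison broadcasts to an n x n matrix
t = df['Target'].values[:, None]
v = df['Value'].values
# m[i, j] is True when Value[j] exceeds Target[i]
m = v > t
# Discard the current row and all earlier rows (lower triangle plus diagonal)
m[np.tril_indices(m.shape[1])] = False
print(m)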
[[False False False False  True  True False False False  True]
 [False False False False  True  True False False False  True]
 [False False False  True  True  True False False  True  True]
 [False False False False  True False False False False  True]
 [False False False False False False False False False False]
 [False False False False False False False False  True  True]
 [False False False False False False False False False  True]
 [False False False False False False False False  True  True]
 [False False False False False False False False False  True]
 [False False False False False False False False False False]]
# argmax returns 0 for all-False rows, so rows without a match become <= 0 here
a = np.argmax(m, axis=1) - np.arange(len(df))
print(a)
[ 4  3  1  1 -4  3  3  1  1 -9]
df['new'] = np.where(a > 0, a, 0)
print(df)
   Value  Target  new
0      8      12    4
1      9      12    3
2     10      11    1
3     12      15    1
4     16      19    0
5     13      11    3
6     11      16    3
7      7      11    1
8     12      17    1
9     18      18    0
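The same steps can be wrapped in a reusable function. The sketch below is a variation, not the code above: it assumes `np.triu(..., k=1)` as an equivalent way to keep only later rows, and it uses `any(axis=1)` to detect rows without a match explicitly instead of relying on negative offsets. The name `first_exceed_offset` is hypothetical.

```python
import numpy as np

def first_exceed_offset(values, targets):
    """Hypothetical helper: offset to the first later value that is strictly
    greater than each target, or 0 when there is none."""
    v = np.asarray(values)
    t = np.asarray(targets)[:, None]
    # Keep only comparisons strictly above the diagonal (later rows)
    m = np.triu(v > t, k=1)
    # Position of the first True per row (0 if the row is all False)
    first = m.argmax(axis=1)
    offsets = first - np.arange(len(v))
    # Rows with no True at all get 0 instead of a meaningless offset
    return np.where(m.any(axis=1), offsets, 0)

values = [8, 9, 10, 12, 16, 13, 11, 7, 12, 18]
targets = [12, 12, 11, 15, 19, 11, 16, 11, 17, 18]
print(first_exceed_offset(values, targets).tolist())
# [4, 3, 1, 1, 0, 3, 3, 1, 1, 0]
```

Both versions build an n x n boolean matrix, so memory use grows quadratically with the number of rows; for very long columns it may be necessary to process the rows in chunks.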