Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas dataframe calculations with previous row

I have the dataframe comprised of 'Linear' and 'Delta' and need to create the 'New' column.

In:

Linear   Delta
    30      -3  
    60     1.4
    65    -0.3
    62     4.4
    21    -2.5
    18    -0.1
    34    -3.1
    30    -1.5
    45     0.5
    55    -1.4
    43     2.8
    51     4.7
    62     2.7

Out:

Linear   Delta        New
    30      -3  
    60     1.4       60.0
    65    -0.3       59.7
    62     4.4       64.1
    21    -2.5       61.6
    18    -0.1       61.5
    34    -3.1       58.4
    30    -1.5       56.9
    45     0.5       57.4
    55    -1.4       55.0
    43     2.8       57.8
    51     4.7       51.0
    62     2.7       53.7

The algorithmic formula is the following one:

New[i] = IF( AND(Linear[i-1]<50,Linear[i]>50) ,  Linear ,  New[i-1]+Delta[i]  )

I tried a lot of different approaches such as cumsum() but never found the solution. I have spent many hours but in vain.

like image 632
user3203275 Avatar asked Mar 17 '26 15:03

user3203275


2 Answers

For this kind of recursive algorithm, consider numba with a manual loop. You will likely find just-in-time compilation more efficient than Pandas-based methods / iteration.

from numba import jit

@jit(nopython=True)
def calc_new(L, D):
    res = np.zeros(L.shape)
    res[0] = np.nan
    for i in range(1, len(res)):
        res[i] = L[i] if (L[i-1] < 50) & (L[i] > 50) else res[i-1] + D[i]
    return res

df['New'] = calc_new(df['Linear'].values, df['Delta'].values)

Result

print(df)

    Linear  Delta   New
0       30   -3.0   NaN
1       60    1.4  60.0
2       65   -0.3  59.7
3       62    4.4  64.1
4       21   -2.5  61.6
5       18   -0.1  61.5
6       34   -3.1  58.4
7       30   -1.5  56.9
8       45    0.5  57.4
9       55   -1.4  55.0
10      43    2.8  57.8
11      51    4.7  51.0
12      62    2.7  53.7
like image 69
jpp Avatar answered Mar 20 '26 06:03

jpp


Not very nice, but working:

df['NEW'] = np.nan
for i, row in df.iterrows():
    if i > 0:
        m = (row['Linear'] > 50) & (df.loc[i-1, 'Linear'] < 50)
        df.loc[i, 'NEW'] = np.where(m, row['Linear'], row['Delta'] + df.loc[i-1, 'NEW'])

print (df)
    Linear  Delta   New   NEW
0       30   -3.0   NaN   NaN
1       60    1.4  60.0  60.0
2       65   -0.3  59.7  59.7
3       62    4.4  64.1  64.1
4       21   -2.5  61.6  61.6
5       18   -0.1  61.5  61.5
6       34   -3.1  58.4  58.4
7       30   -1.5  56.9  56.9
8       45    0.5  57.4  57.4
9       55   -1.4  55.0  55.0
10      43    2.8  57.8  57.8
11      51    4.7  51.0  51.0
12      62    2.7  53.7  53.7
like image 26
jezrael Avatar answered Mar 20 '26 04:03

jezrael



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!