Say I have the following dataframe:
import pandas as pd
df = pd.DataFrame()
df['A'] = ('1/05/2019','2/05/2019','3/05/2019','4/05/2019','5/05/2019','6/05/2019','7/05/2019','8/05/2019','9/05/2019','10/05/2019','11/05/2019','12/05/2019','13/05/2019','14/05/2019','15/05/2019','16/05/2019','17/05/2019','18/05/2019','19/05/2019','20/05/2019')
df['B'] = ('SIT','SCLOSE', 'SHODL', 'SHODL', 'SHODL', 'SHODL', 'SHODL', 'SELL','SIT','SIT','BCLOSE', 'BHODL', 'BHODL', 'BHODL', 'BHODL', 'BHODL', 'BHODL','BUY','SIT','SIT')
df['C'] = (0.00,1.00,10.00, 5.00,6.00,-6.00, 6.00, 0.00,0.00,0.00,-8.00,33.00,-15.00,6.00,-1.00,5.00,10.00,0.00,0.00,0.00)
df.loc[19, 'D'] = 100.0000
As can be seen I am starting column D with 100 at the last row.
I am trying to code a calculation for column D so starting from the bottom row (row 19) when a BUY or SELL is shown on column B then the number on column D is locked (eg the 100) and used for a calculation based on col C for each SHODL or BHODL until the row after a BCLOSE or an SCLOSE is shown.
The locked number is used to calculate a running balance based on the percentages that are in column C. As you can see on row 16 column C has '10' representing 10%. As 10% of 100 = 10 the new runnning balance is 110.
Row 15 column C has 5% as such 5 is added to the running balance to result in 115.
The next row 14 column C has a -1% change as such 1% of 100 is = 1 and therefore the new running balance is 114 and so on.
The following are the results that should be returned in col D of the dataframe once the right code is run
df['D'] = ('158.60','158.60', '157.30', '144.30', '137.80', '130.00', '137.80', '130.00','130.00','130.00','130.00', '138.00', '105.00', '120.00', '114.00', '115.00', '110.00','100.00','100.00','100.00')
This continues until after a SCLOSE or a BCLOSE is shown as a BCLOSE or SCLOSE row is the final row where the running balance is calculated.
As you can see this process is restarted when either a new BUY or SELL is shown.
Next starting value depends on the last value of previous group, so I think it can't be vectorized. It requires some kind of iterative process. I came up with solution doing iteratively on groups of groupby. Reverse df
and assign to df1
. Working on each group of df1
and assign the final list of groups to the original df
df1 = df[::-1]
s = df1.B.isin(['BCLOSE','SCLOSE']).shift(fill_value=False).cumsum()
grps = df1.groupby(s)
init_val= 100
l = []
for _, grp in grps:
s = grp.C * 0.01 * init_val
s.iloc[0] = init_val
s = s.cumsum()
init_val = s.iloc[-1]
l.append(s)
df['D'] = pd.concat(l)
Out[50]:
A B C D
0 1/05/2019 SIT 0.0 158.6
1 2/05/2019 SCLOSE 1.0 158.6
2 3/05/2019 SHODL 10.0 157.3
3 4/05/2019 SHODL 5.0 144.3
4 5/05/2019 SHODL 6.0 137.8
5 6/05/2019 SHODL -6.0 130.0
6 7/05/2019 SHODL 6.0 137.8
7 8/05/2019 SELL 0.0 130.0
8 9/05/2019 SIT 0.0 130.0
9 10/05/2019 SIT 0.0 130.0
10 11/05/2019 BCLOSE -8.0 130.0
11 12/05/2019 BHODL 33.0 138.0
12 13/05/2019 BHODL -15.0 105.0
13 14/05/2019 BHODL 6.0 120.0
14 15/05/2019 BHODL -1.0 114.0
15 16/05/2019 BHODL 5.0 115.0
16 17/05/2019 BHODL 10.0 110.0
17 18/05/2019 BUY 0.0 100.0
18 19/05/2019 SIT 0.0 100.0
19 20/05/2019 SIT 0.0 100.0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With