Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimising iterative computation of values based on growth rate

This is my dataframe:

Date              A          new_growth_rate
2011/01/01      100             
2011/02/01      101             
.
2012/01/01      120            0.035
2012/02/01      121            0.035
.
2013/01/01      131            0.036
2013/01/01      133            0.038

This is what I need:

Date              A          new_growth_rate
2011/01/01      100             
2011/02/01      101             
.
.
2012/01/01      103.62          .035   A=100/(1-0.035)
2012/02/01      104.66          .035   A=101/(1-0.035)
.
.
2013/01/01     107.49           .036   A=103.62/(1-0.036)
2013/02/01     108.68           .038   A=104.66/(1-0.038)

I need to Calculate Value based on growth rate for each column I have a dataframe with 400 columns and their corresponding growth rate.

I have calculate the growth rate using the following formula: (one year old value)*(1+current month growth rate). this calculated value will be used in get next year value and so forth. Like this I have 400 columns and their corresponding growth rate. The time series has 30 years of data

Currently I am using 2 for loop one to get each column and then second to iterate over the time period for each column and get values calculated in previous for loop. It takes couple of hours to go over 500 rows and 400 columns dataset. Is there a better way for this?`

My code snippet is below:

grpby=list of column in dataframe

df_new=pd.DataFrame()
for i,row in grpby.iterrows():
    df_csr=grwth.loc[(grwth['A']==row['A'])].copy()
        a = pd.to_datetime("2011-12-01",format='%Y-%m-%d')
        b = a
        while b <a+relativedelta.relativedelta(months=420):
            b=b+relativedelta.relativedelta(months=1)
            val= df_csr.loc[df_csr['Date']==(b+relativedelta.relativedelta(months=-12))].copy()
            val2=val.get_value(val.index[0],'Val')
            grwth_r=df_csr.loc[df_csr['date']==b]['new_growth_rate'].copy()
            grwth_r2=grwth_r.get_value(grwth_r.index[0],'new_growth_rate')
            df_csr.loc[df_csr['Date']==b,'Val']=val2/(1-grwth_r2)
        df_new=pd.concat([df_new,df_csr])
like image 334
Sanjay Avatar asked Nov 30 '16 04:11

Sanjay


1 Answers

You can use the year value as an index and then use a simple for loop to assign the data i.e

df['Date'] = pd.to_datetime(df['Date'])
df = df.set_index('Date')
years = (df.index.year).unique()

for i,j in enumerate(years):
    if i != 0:   
        prev = df.loc[df.index.year == years[i-1]]
        curr = df.loc[df.index.year == j]
        df.loc[df.index.year == j,'A'] = prev['A'].values/(1-curr['new_growth_rate'].values)

Output :

                    A  new_growth_rate
Date                                   
2011-01-01  100.000000              NaN
2011-02-01  101.000000              NaN
2012-01-01  103.626943            0.035
2012-02-01  104.663212            0.035
2013-01-01  107.496829            0.036
2013-01-01  108.797518            0.038

Hope it helps

like image 148
Bharath Avatar answered Nov 15 '22 00:11

Bharath