I have a DataFrame that looks like:
       a            b
1    0.9  0.796522123
2    0.8  0.701075019
3    0.6  0.777130253
4    0.5  0.209912906
5   0.75  0.920537662
6      1  0.955212665
7    3.5  0.227221963
8      2  0.336632891
9   1.25  0.563511758
10     1  0.832624112
I want to create a moving average with a maximum period of 3, where each observation is df['a']*df['b'].
If df['a'].rolling(window=3).sum() <= 3, then the MA would be:
df['MA'] = (df['a']*df['b']).rolling(window=3).mean()
However, if df['a'].rolling(window=3).sum() > 3, as is the case for rows 8 to 10, then the 'a' values should be used starting from the most recent row and only until they total 3. For the window ending at row 10, I would like the moving average to be:
((1*0.832624112)+(1.25*0.563511758)+(0.75*0.336632891))/3
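To make the capping rule concrete, here is a minimal sketch of the arithmetic for the single window ending at row 10. The names `cap`, `remaining`, and `weights` are illustrative only and do not come from pandas; the sketch assumes the 'a' values are consumed newest-first until they total 3.

```python
# Rows 10, 9, 8 of the example data, newest first.
a = [1.0, 1.25, 2.0]
b = [0.832624112, 0.563511758, 0.336632891]

cap = 3.0            # assumed maximum total of 'a' allowed in one window
remaining = cap
weights = []
for value in a:
    w = min(value, remaining)   # take as much of 'a' as the cap still allows
    weights.append(w)
    remaining -= w

# Divide by the window length (3), as in the expression above.
ma = sum(w * x for w, x in zip(weights, b)) / 3
print(weights)        # [1.0, 1.25, 0.75]
print(round(ma, 6))   # 0.596496
```

Row 8 contributes only 0.75 of its 'a' value of 2, because rows 10 and 9 have already used 2.25 of the allowed 3.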
I've been toying with creating a function and then applying it, something like:
def MA(a, b, period):
    total = 0
    sum_a = 0
    for i in (b):
        if sum_a < period:
            sum_a += a
            total += (a*b)
        else:
            sum_a = sum_a - a
            diff = period - sum_a
            total = total + (diff*b)
    return total
df['MA'] = df.apply(lambda x: MA(a, b, period), axis=1)
I can't get this to work, and I assume there is an easier way to do this using pandas or numpy.
Many thanks in advance.
import numpy as np

# Note: relies on the global df having the integer index 1..10 shown in the question.
def rolling_mean(x):
    # The first two rows do not have a full window of 3 rows.
    if x.name < 3:
        return np.nan
    # Cap the current row's 'a' value at 3.
    a1 = x.a if x.a <= 3 else 3
    # Look at the previous 2 rows and reduce the 'a' value if the sum of the 3 rows is greater than 3.
    a2 = df.loc[x.name-1].a if (a1 + df.loc[x.name-1].a) <= 3 else 3 - a1
    a3 = df.loc[x.name-2].a if (a1 + a2 + df.loc[x.name-2].a) <= 3 else 3 - a1 - a2
    # Calculate the mean of adjusted a * b for the current row and the 2 rows above.
    return (a1*x.b + a2*df.loc[x.name-1].b + a3*df.loc[x.name-2].b) / 3.0

# Apply rolling_mean to the DataFrame to create a new column 'MA'.
df['MA'] = df.apply(func=rolling_mean, axis=1)
df
Out[500]:
       a         b        MA
1   0.90  0.796522       NaN
2   0.80  0.701075       NaN
3   0.60  0.777130  0.581336
4   0.50  0.209913  0.377365
5   0.75  0.920538  0.420546
6   1.00  0.955213  0.583524
7   3.50  0.227222  0.227222
8   2.00  0.336633  0.300163
9   1.25  0.563512  0.431166
10  1.00  0.832624  0.596496
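The same rule can also be written without hard-coding three rows. The following is only a sketch, not the method used above: `capped_weighted_ma`, `window`, and `cap` are hypothetical names, and it assumes the 'a' values are non-negative and are consumed newest-first until they total `cap`. It does not depend on the index labels.

```python
import numpy as np
import pandas as pd


def capped_weighted_ma(df, window=3, cap=3.0):
    """Rolling mean of a*b, with the 'a' weights (newest first) capped at `cap` in total."""
    a = df["a"].to_numpy(dtype=float)
    b = df["b"].to_numpy(dtype=float)
    out = np.full(len(df), np.nan)   # rows without a full window stay NaN
    for end in range(window - 1, len(df)):
        # Values in the current window, newest row first.
        a_win = a[end - window + 1:end + 1][::-1]
        b_win = b[end - window + 1:end + 1][::-1]
        # Total of 'a' already consumed by newer rows in the window.
        used_before = np.concatenate(([0.0], np.cumsum(a_win)[:-1]))
        # Each row gets at most its own 'a', and at most what the cap still allows.
        weights = np.clip(cap - used_before, 0.0, a_win)
        out[end] = (weights * b_win).sum() / window
    return pd.Series(out, index=df.index)


df = pd.DataFrame(
    {
        "a": [0.9, 0.8, 0.6, 0.5, 0.75, 1, 3.5, 2, 1.25, 1],
        "b": [0.796522123, 0.701075019, 0.777130253, 0.209912906, 0.920537662,
              0.955212665, 0.227221963, 0.336632891, 0.563511758, 0.832624112],
    },
    index=range(1, 11),
)
df["MA"] = capped_weighted_ma(df)
print(df)
```

On the example data this should reproduce the MA column shown above. The loop is over rows rather than fully vectorized, which keeps the capping logic easy to follow.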