I want to do calculations on three columns of a dataframe df. To do that, I want to load a list of asset (cryptocurrency) prices into a three-column table and calculate the exponential moving average of each asset once I have enough data.
def calculateAllEMA(self,values_array):
    df = pd.DataFrame(values_array, columns=['BTC', 'ETH', 'DASH'])
    column_by_search = ["BTC", "ETH", "DASH"]
    print(df)
    for i,column in enumerate(column_by_search):
        ema=[]
        # over and over for each day that follows day 23 to get the full range of EMA
        for j in range(0, len(column)-24):
            # Add the closing prices for the first 22 days together and divide them by 22.
            EMA_yesterday = column.iloc[1+j:22+j].mean()
            k = float(2)/(22+1)
            # getting the first EMA day by taking the following day's (day 23) closing price multiplied by k, then multiply the previous day's moving average by (1-k) and add the two.
            ema.append(column.iloc[23 + j]*k+EMA_yesterday*(1-k))
        print("ema")
        print(ema)
        mean_exp[i] = ema[-1]
    return mean_exp
Yet when I print len(column)-24, I get -21 (-24 + 3?), so the loop never runs. How can I fix this error and get the exponential moving average of the assets?
I tried to follow the pseudocode for the exponential moving average from this link on iexplain.com.
If you have any easier idea, I'm open to hear it.
Here is the data I am using when the bug occurs:
BTC ETH DASH
0 4044.59 294.40 196.97
1 4045.25 294.31 196.97
2 4044.59 294.40 196.97
3 4045.25 294.31 196.97
4 4044.59 294.40 196.97
5 4045.25 294.31 196.97
6 4044.59 294.40 196.97
7 4045.25 294.31 196.97
8 4045.25 294.31 196.97
9 4044.59 294.40 196.97
10 4045.25 294.31 196.97
11 4044.59 294.40 196.97
12 4045.25 294.31 196.97
13 4045.25 294.32 197.07
14 4045.25 294.31 196.97
15 4045.41 294.46 197.07
16 4045.25 294.41 197.07
17 4045.41 294.41 197.07
18 4045.41 294.47 197.07
19 4045.25 294.41 197.07
20 4045.25 294.32 197.07
21 4045.43 294.35 197.07
22 4045.41 294.46 197.07
23 4045.25 294.41 197.07
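On the -21: column_by_search holds the column names, so inside the loop column is a string such as "BTC", and len("BTC") - 24 is 3 - 24 = -21. The prices themselves have to be looked up with df[column]. A minimal sketch of the difference, assuming a small made-up frame (the values below are placeholders, not the data above):

```python
import pandas as pd

# Placeholder prices: 30 rows per asset, only to show the lengths involved.
df = pd.DataFrame({
    "BTC": [4045.0 + 0.1 * n for n in range(30)],
    "ETH": [294.0 + 0.01 * n for n in range(30)],
    "DASH": [197.0] * 30,
})

for name in ["BTC", "ETH", "DASH"]:
    prices = df[name]  # the Series of prices for this asset
    # len(name) counts the characters in the label; len(prices) counts the rows.
    print(name, len(name) - 24, len(prices) - 24)
```

Note that with only 24 rows, as in the data above, len(df[column]) - 24 is 0, so range(0, 0) would still be empty; that loop needs at least 25 rows to run once.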
The current EMA is calculated with the following formula: EMA = Closing price x multiplier + EMA (previous day) x (1 - multiplier), where multiplier = 2 / (number of periods + 1).
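That recursion can be written out directly. The sketch below assumes a span of 22, so the multiplier is 2/(22+1), and seeds the EMA with the first price, which is one common convention (the question's code seeds with a simple average instead). Under those assumptions it should agree with pandas' ewm(span=22, adjust=False).mean(). The name ema_recursive is chosen here for illustration only.

```python
import numpy as np
import pandas as pd

def ema_recursive(prices, span):
    """EMA_t = price_t * k + EMA_{t-1} * (1 - k), seeded with the first price."""
    k = 2.0 / (span + 1)          # the multiplier from the formula
    ema = [float(prices[0])]      # assumption: seed with the first observation
    for price in prices[1:]:
        ema.append(price * k + ema[-1] * (1 - k))
    return ema

prices = [4044.59, 4045.25, 4044.59, 4045.25, 4045.41, 4045.25]
by_hand = ema_recursive(prices, span=22)
by_pandas = pd.Series(prices).ewm(span=22, adjust=False).mean()

# Check whether the hand-written recursion agrees with pandas.
print(np.allclose(by_hand, by_pandas))
```

With adjust=True, pandas normalises the weights instead of using this plain recursion, so the early rows come out differently.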
The exponential moving average (EMA) is a type of moving average. It helps filter out noise and produces a smoother curve. Two kinds of moving average are very popular: simple and exponential. The simple moving average just takes the mean of the data in each window, so its value changes from interval to interval.
In Python, we can calculate a moving average with the pandas .rolling() method. It provides rolling windows over the data, and we can apply mean() over these windows to get the moving average. The window size is passed as a parameter to .rolling().
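For comparison, a simple moving average with rolling() might look like this. The window of 3 and the sample prices are arbitrary choices for illustration.

```python
import pandas as pd

prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])

# Mean over a sliding window of 3 observations. The first two rows are NaN
# because the window is not yet full at those positions.
sma = prices.rolling(window=3).mean()
print(sma)
```

Passing min_periods=1 to rolling() would fill those first rows with the mean of whatever data is available so far.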
pandas.stats.moments.ewma from the original answer has been deprecated. Instead you can use pandas.DataFrame.ewm as documented here.
Below is a complete snippet with random data that builds a dataframe with calculated ewmas from specified columns.
Code:
# imports
import pandas as pd
import numpy as np

# Random sample data: 50 daily observations for three assets
np.random.seed(123)
rows = 50
df = pd.DataFrame(np.random.randint(90,110,size=(rows, 3)), columns=['BTC', 'ETH', 'DASH'])
# pd.datetime is no longer available in recent pandas; date_range accepts a date string
datelist = pd.date_range('2017-01-01', periods=rows).tolist()
df['dates'] = datelist
df = df.set_index(['dates'])
df.index = pd.to_datetime(df.index)

def ewmas(df, win, keepSource):
    """Add exponentially weighted moving averages for all columns in a dataframe.

    Arguments:
    df -- pandas dataframe
    win -- length of ewma estimation window
    keepSource -- True or False for keep or drop source data in output dataframe
    """
    df_temp = df.copy()
    # Manage existing column names
    colNames = list(df_temp.columns.values)
    for col in colNames:
        # Make new names for ewmas
        ewmaName = col + '_ewma_' + str(win)
        # Add ewmas
        #df_temp[ewmaName] = pd.stats.moments.ewma(df[col], span = win)
        df_temp[ewmaName] = df[col].ewm(span = win, adjust=True).mean()
    # Remove estimates with insufficient window length
    df_temp = df_temp.iloc[win:]
    # Remove or keep source data
    if not keepSource:
        # The positional axis argument (drop(names, 1)) is not accepted by recent pandas
        df_temp = df_temp.drop(columns=colNames)
    return df_temp

# Test run
df_new = ewmas(df = df, win = 22, keepSource = True)
print(df_new.tail())
Output:
            BTC  ETH  DASH  BTC_ewma_22  ETH_ewma_22  DASH_ewma_22
dates
2017-02-15   91   96    98    98.752431   100.081052     97.926787
2017-02-16  100  102   102    98.862445   100.250270     98.285973
2017-02-17  100  107    97    98.962634   100.844749     98.172712
2017-02-18  103  102    91    99.317826   100.946384     97.541684
2017-02-19   99  104    91    99.289894   101.214755     96.966758
Plot using df_new[['BTC', 'BTC_ewma_22']].plot():