I am working on finding some sort of moving average in a dataframe. The formula will change based on the number of the row it is being computed for. The actual scenario is where I need to compute column Z.
Edit-2:
Below is the actual data I am working with
Date Open High Low Close
0 01-01-2018 1763.95 1763.95 1725.00 1731.35
1 02-01-2018 1736.20 1745.80 1725.00 1743.20
2 03-01-2018 1741.10 1780.00 1740.10 1774.60
3 04-01-2018 1779.95 1808.00 1770.00 1801.35
4 05-01-2018 1801.10 1820.40 1795.60 1809.95
5 08-01-2018 1816.00 1827.95 1800.00 1825.00
6 09-01-2018 1823.00 1835.00 1793.90 1812.05
7 10-01-2018 1812.05 1823.00 1801.40 1816.55
8 11-01-2018 1825.00 1825.05 1798.55 1802.10
9 12-01-2018 1805.00 1820.00 1794.00 1804.95
10 15-01-2018 1809.90 1834.45 1792.45 1830.00
11 16-01-2018 1835.00 1857.45 1826.10 1850.25
12 17-01-2018 1850.00 1852.45 1826.20 1840.50
13 18-01-2018 1840.50 1852.00 1823.50 1839.00
14 19-01-2018 1828.25 1836.35 1811.00 1829.50
15 22-01-2018 1816.50 1832.55 1805.50 1827.20
16 23-01-2018 1825.00 1825.00 1782.25 1790.15
17 24-01-2018 1787.80 1792.70 1732.15 1737.50
18 25-01-2018 1739.90 1753.40 1720.00 1726.40
19 29-01-2018 1735.15 1754.95 1729.80 1738.70
The code snippet I am using is as below:
from datetime import date
from nsepy import get_history
import csv
import pandas as pd
import numpy as np
import requests
from datetime import timedelta
import datetime as dt
import pandas_datareader.data as web
import io
df = pd.read_csv('ACC.CSV')
idx = df.reset_index().index
df['Change'] = df['Close'].diff()
df['Advance'] = np.where(df.Change > 0, df.Change,0)
df['Decline'] = np.where(df.Change < 0, df.Change*-1, 0)
conditions = [idx < 14, idx == 14, idx > 14]
values = [0, (df.Advance.rolling(14).sum())/14, (df.Avg_Gain.shift(1) * 13 + df.Advance)/14]
df['Avg_Gain'] = np.select(conditions, values)
df['Avg_Loss'] = (df.Decline.rolling(14).sum())/14
df['RS'] = df.Avg_Gain / df.Avg_Loss
df['RSI'] = np.where(df['Avg_Loss'] == 0, 100, 100-(100/(1+df.RS)))
df.drop(['Change', 'Advance', 'Decline', 'Avg_Gain', 'Avg_Loss', 'RS'], axis=1)
print(df.head(20))
Below is the error I am getting:
Traceback (most recent call last):
File "C:/Users/Lenovo/Desktop/Python/0.Chart Patterns/Z.Sample Code.py", line 20, in <module>
values = [0, (df.Advance.rolling(14).sum())/14, (df.Avg_Gain.shift(1) * 13 + df.Advance)/14]
File "C:\Users\Lenovo\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\generic.py", line 3614, in __getattr__
return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'Avg_Gain'
Edit 3: Below is the expected output and then I will also write down the formula. Original DF consists of columns Date Open High Low & Close.
Advance and Decline –
• If the difference between current & previous is +ve then Advance = difference and Decline = 0
• If the difference between current & previous is –ve then Advance = 0 and Decline = -1 * difference
Avg_Gain:
• If index is < 13 then Avg_Gain = 0
• If index = 13 then Avg_Gain = Average of 14 periods
• If index > 13, then Avg_Gain = (Avg_Gain(previous-row) * 13 + Advance(current-row) )/14
Avg_Loss:
• If index is < 13 then Avg_Loss = 0
• If index = 13 then Avg_Loss = Average of Advance of 14 periods
• If index > 13, then Avg_Loss = (Avg_Loss(previous-row) * 13 + Decline(current-row) )/14
RS:
• If index < 13 then RS = 0
• If Index >= 13 then RS = Avg_Gain/Avg_Loss
RSI = 100-(100/(1 + RS))
I hope this helps.
You have an error in your code because you use df.Avg_Gain
in creating df.Avg_Gain
. Y
values = [0, (df.Advance.rolling(14).sum())/14, (df.Avg_Gain.shift(1) * 13 + df.Advance)/14]
df['Avg_Gain'] = np.select(conditions, values)
I changed that part of the code to the following:
up = df.Advance.rolling(14).sum()/14
values = [0, up, (up.shift(1) * 13 + df.Advance)/14]
Output (idx>=14):
Date Open High Low Close RSI
14 2018-01-19 1828.25 1836.35 1811.00 1829.50 75.237850
15 2018-01-22 1816.50 1832.55 1805.50 1827.20 72.920021
16 2018-01-23 1825.00 1825.00 1782.25 1790.15 58.793750
17 2018-01-24 1787.80 1792.70 1732.15 1737.50 40.573938
18 2018-01-25 1739.90 1753.40 1720.00 1726.40 31.900045
19 2018-01-29 1735.15 1754.95 1729.80 1738.70 33.197678
There should be a better way of doing this though. I'll update it with a better solution if i find one. Let me know if this data is correct.
UPDATE: You also need to correct your calculation for 'Avg_loss`::
down = df.Decline.rolling(14).sum()/14
down_values = [0, down, (down.shift(1) * 13 + df.Decline)/14]
df['Avg_Loss'] = np.select(conditions, down_values)
http://stockcharts.com/school/doku.php?id=chart_school:technical_indicators:relative_strength_index_rsi#calculation
UPDATE 2: After Expected data was provided. So the only way i could do this is by looping - not sure if possible to do it otherwise - maybe some pandas functionality i'm unaware of.
So first do the same as before with setting Avg_Gain
and Avg_Loss
:
you just need to change the values slightly:
conditions = [idx<13, idx==13, idx>13]
up = df.Advance.rolling(14).sum()/14
values = [0, up, 0]
df['Avg_Gain'] = np.select(conditions, values)
down = df.Decline.rolling(14).sum()/14
down_values = [0, down, 0]
df['Avg_Loss'] = np.select(conditions, d_values)
I have changed your conditions to split on index 13 - since this is what i see based on the expected output.
Once you run this code you will populate the values for Avg_Gain
and Avg_Loss
from index 14
using the previous value for Agv_Gain
and Avg_Loss
.
p=14
for i in range(p, len(df)):
df.at[i, 'Avg_Gain'] = ((df.loc[i-1, 'Avg_Gain'] * (p-1)) + df.loc[i, 'Advance']) / p
df.at[i, 'Avg_Loss'] = ((df.loc[i-1, 'Avg_Loss'] * (p-1)) + df.loc[i, 'Decline']) / p
Output:
df[13:][['Date','Avg_Gain', 'Avg_Loss', 'RS', 'RSI']]
Date Avg_Gain Avg_Loss RS RSI
13 2018-01-18 10.450000 2.760714 3.785252 79.102460
14 2018-01-19 9.703571 3.242092 2.992997 74.956155
15 2018-01-22 9.010459 3.174800 2.838119 73.945571
16 2018-01-23 8.366855 5.594457 1.495562 59.928860
17 2018-01-24 7.769222 8.955567 0.867530 46.453335
18 2018-01-25 7.214278 9.108741 0.792017 44.196960
19 2018-01-29 7.577544 8.458116 0.895890 47.254330
You can do it by using ewm
and also go have a look at there for a bit more explanation. You will find that ewm
can be used to calculate the y(row)=α*x(row)+(1−α)*y(row−1), where for example y is the column Avg_Gain, x is the value of the column Advance and α is the weight to give to x(row)
# define the number for the window
win_n = 14
# create a dataframe df_avg with the non null value of the two columns
# Advance and Decline such as the first is the mean of the 14 first values and the rest as normal
df_avg = (pd.DataFrame({'Avg_Gain': np.append(df.Advance[:win_n].mean(), df.Advance[win_n:]),
'Avg_Loss': np.append(df.Decline[:win_n].mean(), df.Decline[win_n:])},
df.index[win_n-1:])
.ewm(adjust=False, alpha=1./win_n).mean()) # what you need to calculate with your formula
# create the two other columns RS and RSI
df_avg['RS'] = df_avg.Avg_Gain / df_avg.Avg_Loss
df_avg['RSI'] = 100.-(100./(1. + df_avg['RS']))
and df_avg
looks like:
Avg_Gain Avg_Loss RS RSI
13 10.450000 2.760714 3.785252 79.102460
14 9.703571 3.242092 2.992997 74.956155
15 9.010459 3.174800 2.838119 73.945571
16 8.366855 5.594457 1.495562 59.928860
17 7.769222 8.955567 0.867530 46.453335
18 7.214278 9.108741 0.792017 44.196960
19 7.577544 8.458116 0.895890 47.254330
You can join
it to the original data and fillna
with 0:
df = df.join(df_avg).fillna(0)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With