I have a variable that I want to forecast for the next 30 years. Unfortunately I don't have many samples.
df = pd.DataFrame({
    'FISCAL_YEAR': [1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987,
                    1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996,
                    1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
                    2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014,
                    2015, 2016, 2017, 2018, 2019],
    'VALS': [1341.9, 1966.95, 2085.75, 2087.1, 2760.75, 3461.4, 3156.3,
             3061.8, 2309.85, 2320.65, 2535.3, 2964.6, 2949.75, 2339.55,
             2327.4, 2571.75, 2299.05, 1560.6, 1370.25, 1301.4, 1215.0,
             5691.6, 6281.55, 6529.95, 17666.1, 14467.95, 15205.05,
             14717.7, 14426.1, 12946.5, 13000.5, 12761.55, 13076.1,
             13444.65, 13444.65, 13321.8, 13536.45, 13331.25, 12630.6,
             12741.3, 12658.95]})
Here is my code:
import gc
import numpy as np
import pandas as pd
from numpy.random import seed
from tensorflow import set_random_seed  # TF 1.x
from keras.backend import clear_session
from keras.models import Sequential
from keras.layers import Dense, LSTM, Masking, CuDNNLSTM

def build_model(n_neurons, dropout, s, cudnn=False):
    lstm = Sequential()
    if cudnn:
        lstm.add(CuDNNLSTM(n_neurons, input_shape=(s[1], s[2])))
    else:
        lstm.add(Masking(mask_value=-1, input_shape=(s[1], s[2])))
        lstm.add(LSTM(n_neurons, dropout=dropout))
    lstm.add(Dense(1))
    #lstm.add(Activation('softmax'))
    lstm.compile(loss='mean_squared_error', optimizer='adam')
    return lstm
def create_df(dfin, fwd, lstmws):
    ''' Input Normalization '''
    idx = dfin.FISCAL_YEAR.values[fwd:]
    dfx = dfin[[varn]].copy()
    dfy = dfin[[varn]].copy()
    # LSTM window - use last lstmws values
    for i in range(0, lstmws - 1):
        dfx = dfx.join(dfin[[varn]].shift(-i - 1), how='left', rsuffix='{:02d}'.format(i + 1))
    dfx = (dfx - vmnx).divide(vmxx - vmnx)
    dfx.fillna(-1, inplace=True)  # replace missing values with -1
    dfy = (dfy - vmnx).divide(vmxx - vmnx)
    dfy.fillna(-1, inplace=True)  # replace missing values with -1
    return dfx, dfy, idx
def forecast(dfin, dfx, lstm, idx, gapyr=1):
    ''' Model Forecast '''
    xhat = dfx.values
    xhat = xhat.reshape(xhat.shape[0], lstmws, int(xhat.shape[1] / lstmws))
    yhat = lstm.predict(xhat)
    yhat = yhat * (vmxx - vmnx) + vmnx
    dfout = pd.DataFrame(list(zip(idx + gapyr, yhat.reshape(1, -1)[0])), columns=['FISCAL_YEAR', varn])
    dfout = pd.concat([dfin.head(1), dfout], axis=0).reset_index(drop=True)
    # append last prediction to X and use for prediction
    dfin = pd.concat([dfin, dfout.tail(1)], axis=0).reset_index(drop=True)
    return dfin
def lstm_training(dfin, lstmws, fwd, num_years, batchsize=4, cudnn=False, n_neurons=47, dropout=0.05, retrain=False):
    ''' LSTM Parameter '''
    seed(2018)
    set_random_seed(2018)
    gapyr = 1  # Forecast +1 Year
    n_epochs = 200 if cudnn else 500  # epoch count per backend
    dfx, dfy, idx = create_df(dfin, fwd, lstmws)
    X, y = dfx.iloc[fwd:-gapyr].values, dfy[fwd + gapyr:].values[:, 0]
    X, y = X.reshape(X.shape[0], lstmws, int(X.shape[1] / lstmws)), y.reshape(len(y), 1)
    lstm = build_model(n_neurons, dropout, X.shape)
    ''' LSTM Training Start '''
    if batchsize == 1:
        history_i = lstm.fit(X, y, epochs=25, batch_size=batchsize, verbose=0, shuffle=False)
    else:
        history_i = lstm.fit(X, y, epochs=n_epochs, batch_size=batchsize, verbose=0, shuffle=False)
    dfin = forecast(dfin, dfx, lstm, idx)
    lstm.reset_states()
    if not retrain:
        for fwd in range(1, num_years):
            dfx, dfy, idx = create_df(dfin, fwd, lstmws)
            dfin = forecast(dfin, dfx, lstm, idx)
            lstm.reset_states()
    del dfy, X, y, lstm
    gc.collect()
    clear_session()
    return dfin, history_i
varn = "VALS"
# LSTM window size
lstmws = 10
vmnx, vmxx = df[varn].astype(float).min(), df[varn].astype(float).max()
dfin, history_i = lstm_training(df, lstmws, 0, 2051 - 2018)
In my first version I retrained the model every time after appending the new prediction, and the predictions never converged. But because retraining after every new observation is very time-consuming, I had to change the approach.
My result:
dfin.VALS.values
array([ 1341.9 , 1966.95 , 2085.75 , 2087.1 ,
2760.75 , 3461.4 , 3156.3 , 3061.8 ,
2309.85 , 2320.65 , 2535.3 , 2964.6 ,
2949.75 , 2339.55 , 2327.4 , 2571.75 ,
2299.05 , 1560.6 , 1370.25 , 1301.4 ,
1215. , 5691.6 , 6281.55 , 6529.95 ,
17666.1 , 14467.95 , 15205.05 , 14717.7 ,
14426.1 , 12946.5 , 13000.5 , 12761.55 ,
13076.1 , 13444.65 , 13444.65 , 13321.8 ,
13536.45 , 13331.25 , 12630.6 , 12741.3 ,
12658.95 , 10345.97167969, 12192.12792969, 13074.4296875 ,
13264.40917969, 12956.1796875 , 12354.1953125 , 11659.03125 ,
11044.06933594, 10643.19921875, 10552.52246094, 10552.52246094,
10552.52246094, 10552.52246094, 10552.52246094, 10552.52246094,
10552.52246094, 10552.52246094, 10552.52246094, 10552.52246094,
10552.52246094, 10552.52246094, 10552.52246094, 10552.52246094,
10552.52246094, 10552.52246094, 10552.52246094, 10552.52246094,
10552.52246094, 10552.52246094, 10552.52246094, 10552.52246094,
10552.52246094, 10552.52246094])
How can I avoid getting the same prediction for the last 20+ years?
EDIT:
I prepended more random data to see whether the small sample size was the problem, but the predictions again become constant after a while.
df0 = pd.DataFrame([range(1900,1979),list(np.random.rand(1979-1900)*(vmxx-vmnx)+vmnx)],index=["FISCAL_YEAR","VALS"]).T
df = pd.concat([df0,df])
df["FISCAL_YEAR"] = df["FISCAL_YEAR"].astype(int)
df.index = range(1900,2020)
A strange thing I have observed is that the predictions become constant after 10 years, i.e. the window size; if I increase lstmws to 20, they become constant after 20 years:
lstmws = 20
Result:
{'FISCAL_YEAR': [2020, 2021, 2022, 2023, 2024, 2025, 2026, 2027, 2028, 2029, 2030, 2031, 2032, 2033, 2034, 2035, 2036, 2037, 2038, 2039, 2040, 2041, 2042, 2043, 2044, 2045, 2046, 2047, 2048, 2049, 2050, 2051, 2052],
 'VALS': [11183.32421875, 12388.28125, 13151.013671875, 12543.6796875, 12590.0888671875, 12002.583984375, 11822.8857421875, 11479.6572265625, 11423.1279296875, 11444.5751953125, 11506.60546875, 11563.3173828125, 11595.0029296875, 11599.8955078125, 11586.8037109375, 11571.337890625, 11574.541015625, 11620.7900390625, 11734.2431640625, 11934.216796875, 11934.216796875, 11934.216796875, 11934.216796875, 11934.216796875, 11934.216796875, 11934.216796875, 11934.216796875, 11934.216796875, 11934.216796875, 11934.216796875, 11934.216796875, 11934.216796875, 11934.216796875]}
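For what it's worth, this window-size pattern is what any deterministic recursive forecast tends toward: after lstmws steps the input window consists entirely of the model's own outputs, and a fixed map applied to such a window keeps reproducing (nearly) the same value. A toy sketch (the averaging "model" below is a hypothetical stand-in, not an LSTM) shows the same collapse:

```python
import numpy as np

def one_step_model(window):
    # hypothetical stand-in for a trained one-step forecaster:
    # here simply the mean of the input window
    return float(np.mean(window))

rng = np.random.RandomState(0)
window = list(rng.uniform(1000.0, 2000.0, 10))  # 10 observed values (lstmws = 10)
preds = []
for _ in range(40):
    yhat = one_step_model(window[-10:])  # recursive forecast: feed predictions back in
    preds.append(yhat)
    window.append(yhat)
# once the window holds only the model's own outputs, the forecast flattens out
```

The spread of the later predictions is visibly smaller than that of the early ones, which mirrors the constant tail above.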
In my experience with LSTMs (I've been generating dance sequences like this), I've found two things in particular help prevent the model from stagnating and predicting the same output.
First, it's helpful to use a Mixture Density Network instead of an L2 loss (as you have). Read Christopher Bishop's paper on MDN layers for the details, but in short: an L2 loss drives the model toward predicting the conditional average of the targets for a given input. If a single value x has multiple plausible outputs y0, y1, y2, each with some probability (as many complex systems do), you'll want to consider an MDN layer with a negative log likelihood loss. Here is a Keras implementation that I'm using.
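To make the loss concrete, here is a minimal NumPy sketch of the negative log likelihood an MDN minimizes for a 1-D Gaussian mixture (this is an illustration of the math, not the Keras implementation linked above):

```python
import numpy as np

def mixture_nll(y, pi, mu, sigma):
    # negative log likelihood of targets y under a 1-D Gaussian mixture;
    # pi: (n, k) mixture weights (rows sum to 1), mu/sigma: (n, k) component params
    comp = pi * np.exp(-0.5 * ((y[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return float(-np.mean(np.log(comp.sum(axis=1) + 1e-12)))

# a component centred on the target scores better (lower NLL) than one far away
y = np.array([0.0])
good = mixture_nll(y, np.array([[1.0]]), np.array([[0.0]]), np.array([[1.0]]))
bad = mixture_nll(y, np.array([[1.0]]), np.array([[5.0]]), np.array([[1.0]]))
```

Unlike squared error, this loss lets the network place probability mass on several distinct outcomes instead of averaging them into one value.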
Reading your situation a little more closely now, this may not be helpful for your case, as you seem to be predicting a time series in which, by definition, each x maps to a single y.
Next, I've found it helpful to feed my LSTM the n sequence values prior to the one I'm trying to predict. The larger n is, the better the results I've found (though the slower training goes). Many papers I've read use 1024 prior sequence values to predict the next one.
You don't have many observations, but you could try feeding in the prior 8 observations to predict the next one.
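Framed against the data in the question, the "prior 8 observations" idea is just a sliding window. A minimal sketch (the make_windows helper is hypothetical, not part of the code above):

```python
import numpy as np

def make_windows(series, n):
    # hypothetical helper: frame a 1-D series for Keras as (samples, n, 1)
    # windows, each paired with the value that immediately follows it
    series = np.asarray(series, dtype=float)
    X = np.stack([series[i:i + n] for i in range(len(series) - n)])
    y = series[n:]
    return X[..., None], y

# first 10 values of the VALS series from the question
vals = [1341.9, 1966.95, 2085.75, 2087.1, 2760.75,
        3461.4, 3156.3, 3061.8, 2309.85, 2320.65]
X, y = make_windows(vals, 8)
# X has shape (2, 8, 1): two training samples of 8 prior observations each
```

With only 41 observations and n = 8 you get 33 training windows, which is at least something for the network to fit.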
Finally, I've ended up here after several years because I was training a model with a categorical crossentropy loss and one hot vectors as input. When I was generating sequences with my trained model, I was using:
# this predicts the same value over and over
predict_length = 100
sequence = X[0]
for i in range(predict_length):
    # note that z is a dense vector -- it needs to be converted to one hot!
    z = model.predict(np.expand_dims(sequence[-sequence_length:], 0))
    sequence = np.vstack([sequence, z])
I should have been converting my output predictions to one hot vectors:
# this predicts new values :)
predict_length = 1000
sequence = X[0]
for i in range(predict_length):
    # z is still a dense vector; we'll convert it to one-hot below
    z = model.predict(np.expand_dims(sequence[-sequence_length:], 0)).squeeze()
    # let's convert z to a one hot vector to match the training data
    prediction = np.zeros(len(types))
    prediction[np.argmax(z)] = 1
    sequence = np.vstack([sequence, prediction])
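A closely related variant (my addition, not from the answer above): instead of always taking argmax, sample the next class from the predicted distribution, which also helps generated sequences avoid locking onto one high-probability class:

```python
import numpy as np

rng = np.random.RandomState(0)
z = np.array([0.7, 0.2, 0.1])  # stand-in for a softmax output vector

# sample an index according to the predicted probabilities
idx = rng.choice(len(z), p=z / z.sum())  # renormalize to guard against rounding

# convert the sampled index to a one-hot vector, as in the loop above
one_hot = np.zeros(len(z))
one_hot[idx] = 1
```

The renormalization is just defensive; softmax outputs can fail `np.random.choice`'s sum-to-one check by a float epsilon.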
I suspect this last step is the reason most people will end up at this thread!