
Length of endogenous variable must be larger the the number of lags used

I was recently following this tutorial on Time Series Analysis in Python by Susan Li. I am fitting a time series SARIMAX model on the following series:

y['2017':]

OUT: 
Order Date
2017-01-01     397.602133
2017-02-01     528.179800
2017-03-01     544.672240
2017-04-01     453.297905
2017-05-01     678.302328
2017-06-01     826.460291
2017-07-01     562.524857
2017-08-01     857.881889
2017-09-01    1209.508583
2017-10-01     875.362728
2017-11-01    1277.817759
2017-12-01    1256.298672
Freq: MS, Name: Sales, dtype: float64

using the following:

import statsmodels.api as sm

mod = sm.tsa.statespace.SARIMAX(y,
                                order=(1, 1, 1),
                                seasonal_order=(1, 1, 0, 12),
                                enforce_stationarity=False,
                                enforce_invertibility=False)

results = mod.fit()

print(results.summary().tables[1])

This works fine up to this point, but when I try to visualize the results, I get the following error:

results.plot_diagnostics(figsize=(16, 8))
OUT: 
ValueError                                Traceback (most recent call last)
<ipython-input-16-6cfeaa52b7c1> in <module>
----> 1 results.plot_diagnostics(figsize=(16, 8))
      2 plt.show()

~/opt/anaconda3/lib/python3.8/site-packages/statsmodels/tsa/statespace/mlemodel.py in plot_diagnostics(self, variable, lags, fig, figsize, truncate_endog_names)
   4282 
   4283         if resid.shape[0] < max(d, lags):
-> 4284             raise ValueError(
   4285                 "Length of endogenous variable must be larger the the number "
   4286                 "of lags used in the model and the number of observations "

ValueError: Length of endogenous variable must be larger the the number of lags used in the model and the number of observations burned in the log-likelihood calculation.

<Figure size 1152x576 with 0 Axes>

Does anyone know how to fix this? Is it some kind of library error? And if it cannot be fixed directly, how can I obtain all the diagnostic plots?

J.P. asked Oct 14 '20 13:10



2 Answers

When defining the model, remove the parameter enforce_stationarity=False and it should work fine!
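
The traceback in the question shows the condition that raises the error: `resid.shape[0] < max(d, lags)`. Below is a minimal pure-Python sketch of that check, meant to illustrate why a model that discards more initial observations can fail while one that discards fewer passes. The function name `diagnostics_would_fail` and all numbers are hypothetical and chosen only for illustration. The default of 10 lags, and the idea that dropping `enforce_stationarity=False` lowers `d`, reflect my understanding of statsmodels rather than anything shown in the question.

```python
def diagnostics_would_fail(nobs, d, lags=10):
    """Mirror the check from the traceback: resid.shape[0] < max(d, lags).

    nobs: number of observations in the series
    d:    number of initial residuals dropped before plotting (assumption:
          this grows when more states are treated as diffuse)
    lags: number of lags requested for the correlogram
    """
    n_resid = max(nobs - d, 0)  # residuals left after dropping the first d
    return n_resid < max(d, lags)


# Hypothetical numbers, not taken from the asker's model:
print(diagnostics_would_fail(nobs=48, d=26))  # True: 22 residuals < 26
print(diagnostics_would_fail(nobs=48, d=13))  # False: 35 residuals >= 13
```

In this sketch, lowering `lags` does not help once `d` is the larger of the two, so the remaining options are a smaller `d` or a longer series.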

Shantanu Arya answered Sep 24 '22 15:09


For newcomers, what Shantanu means is that instead of this:

mod = sm.tsa.statespace.SARIMAX(y,
                                order=(1, 1, 1),
                                seasonal_order=(1, 1, 0, 12),
                                enforce_stationarity=False,
                                enforce_invertibility=False)

you can write the following, with enforce_stationarity=False removed (you can also just comment it out: # enforce_stationarity=False,):

mod = sm.tsa.statespace.SARIMAX(y, order=(1, 1, 1),
                            seasonal_order=(1, 1, 0, 12),
                            enforce_invertibility=False)

Amitkumar Sawant answered Sep 23 '22 15:09