
python ValueError: Start must be in dates. Got 2016-01-01 | 2016-01-01 00:00:00

I am using statsmodels.tsa.statespace.sarimax to make predictions. Here is my code:

pprint(list(rolmean_df.index)[0])
>> datetime.date(2015, 5, 19)

mod = sm.tsa.statespace.SARIMAX(rolmean_df['IC_10_diff'], trend='n', order=(2,1,2))
results = mod.fit()
s = results.get_prediction(start = pd.to_datetime('2016-01-01').date(), dynamic= False)
>> ValueError: Start must be in dates. Got 2016-01-01 | 2016-01-01 00:00:00

I am not sure whether my "start" format is wrong or something else does not work.

goldpiggy asked May 17 '17 18:05



2 Answers

From the doc: http://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.sarimax.SARIMAXResults.html

  • get_forecast([steps]) Out-of-sample forecasts
  • get_prediction([start, end, dynamic, exog]) In-sample prediction and out-of-sample forecasting

So "Start must be in dates" means that start has to be a date that is present in the index of your data set.

If you want to use the model to forecast beyond the end of your data, use get_forecast().

BUPTGuo answered Nov 05 '22 05:11



I had a similar problem. You have to make sure that the start date is contained in the index of rolmean_df['IC_10_diff'].

Just print the index of rolmean_df['IC_10_diff'] and pick the date closest to 2016-01-01.
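Instead of eyeballing the printout, you can pick the closest date programmatically. This is a rough sketch using only the standard library; index_dates is a made-up list standing in for list(rolmean_df.index), which holds datetime.date objects in the question:

```python
from datetime import date

# Hypothetical index values; in the question they would come from list(rolmean_df.index).
index_dates = [date(2015, 12, 30), date(2015, 12, 31), date(2016, 1, 4), date(2016, 1, 5)]
target = date(2016, 1, 1)

# Pick the index date with the smallest absolute gap to the target.
# If the target itself is in the index, its gap is zero and it is returned.
closest = min(index_dates, key=lambda d: abs(d - target))
print(closest)  # 2015-12-31
```

You can then pass closest as start to get_prediction.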

Quantum_Something answered Nov 05 '22 04:11
