
Is it possible to do multivariate multi-step forecasting using FB Prophet?

I'm working on a multivariate (100+ variables), multi-step (t1 to t30) forecasting problem where the time series frequency is 1 minute. The problem requires forecasting one of the 100+ variables as the target. I'd like to know whether this is possible using FB Prophet's Python API. I was able to do it in a univariate fashion using only the target variable and the datetime variable. Any help and direction is appreciated. Please let me know if any further input or clarification is needed.

Abhishek Arora asked Feb 05 '19 22:02




3 Answers

You can add additional variables in Prophet using the add_regressor method.

For example, suppose we want to predict the variable y using the values of the additional variables add1 and add2 as well.

Let's first create a sample df:

import pandas as pd
df = pd.DataFrame(pd.date_range(start="2019-09-01", end="2019-09-30", freq='D', name='ds'))
df["y"] = range(1,31)
df["add1"] = range(101,131)
df["add2"] = range(201,231)
df.head()
          ds  y  add1  add2
0 2019-09-01  1   101   201
1 2019-09-02  2   102   202
2 2019-09-03  3   103   203
3 2019-09-04  4   104   204
4 2019-09-05  5   105   205

and split train and test:

df_train = df.loc[df["ds"]<"2019-09-21"]
df_test  = df.loc[df["ds"]>="2019-09-21"]

Before training the forecaster, we can add regressors that use the additional variables. Here the argument of add_regressor is the column name of the additional variable in the training df.

from prophet import Prophet  # the package was named fbprophet before version 1.0
m = Prophet()
m.add_regressor('add1')
m.add_regressor('add2')
m.fit(df_train)

The predict method will then use the additional variables to forecast:

forecast = m.predict(df_test.drop(columns="y"))

Note that the additional variables must have values for the future (test) period. If you don't have them, you could first forecast add1 and add2 with univariate time series models, and then predict y with add_regressor, using the forecasted add1 and add2 as the future values of the additional variables.
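As a rough sketch of that two-stage idea, the snippet below builds the future frame that predict would need. To keep it lightweight it uses a simple linear extrapolation as a stand-in for the univariate model of each regressor; the helper extrapolate_linear and the 10-day horizon are assumptions made for illustration, and in practice each regressor could get its own univariate Prophet model instead.

```python
import numpy as np
import pandas as pd

# Same toy data as above: 30 daily rows, the first 20 used for training.
df = pd.DataFrame({"ds": pd.date_range("2019-09-01", "2019-09-30", freq="D")})
df["y"] = range(1, 31)
df["add1"] = range(101, 131)
df["add2"] = range(201, 231)
df_train = df.loc[df["ds"] < "2019-09-21"]


def extrapolate_linear(values, steps):
    """Hypothetical stand-in for a univariate forecaster: fit a line, extend it."""
    t = np.arange(len(values))
    slope, intercept = np.polyfit(t, values, deg=1)
    t_future = np.arange(len(values), len(values) + steps)
    return slope * t_future + intercept


horizon = 10  # assumed number of future steps
first_future_day = df_train["ds"].iloc[-1] + pd.Timedelta(days=1)
future = pd.DataFrame(
    {"ds": pd.date_range(first_future_day, periods=horizon, freq="D")}
)

# Stage 1: fill in each regressor's future values from its own history.
for col in ["add1", "add2"]:
    history = df_train[col].to_numpy(dtype=float)
    future[col] = extrapolate_linear(history, horizon)

# Stage 2 (not run here): pass `future` to m.predict of a model fitted with
# add_regressor('add1') and add_regressor('add2').
```

The resulting future frame has the same columns as df_test.drop(columns="y"), so it could be passed to m.predict in its place. Keep in mind that any error in the regressor forecasts carries over into the forecast of y.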

From the documentation I understand that the forecast of y for t+1 will only use the values of add1 and add2 at t+1, and not their values at t, t-1, ..., t-n as it does with y. If that is important for you, you could create new additional variables with the lags.
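A minimal sketch of building such lagged variables with pandas, assuming lags of 1 and 2 steps (the lag depths and the column naming scheme are illustrative choices, not something Prophet prescribes):

```python
import pandas as pd

# Same toy data as above.
df = pd.DataFrame({"ds": pd.date_range("2019-09-01", "2019-09-30", freq="D")})
df["y"] = range(1, 31)
df["add1"] = range(101, 131)
df["add2"] = range(201, 231)

lags = [1, 2]  # assumed lag depths
lag_cols = []
for col in ["add1", "add2"]:
    for k in lags:
        name = f"{col}_lag{k}"
        # shift(k) moves values down by k rows, so row t holds the value from t-k
        df[name] = df[col].shift(k)
        lag_cols.append(name)

# The first max(lags) rows have no lagged value available; drop them so the
# regressor columns contain no missing values.
df_lagged = df.dropna(subset=lag_cols).reset_index(drop=True)
```

Each name in lag_cols could then be registered with m.add_regressor(name) before fitting. For a multi-step forecast, note that a lag of k steps is only known from observed data up to k steps ahead, so a 30-step horizon would need lags of at least 30 steps or forecasted regressor values.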

See also this notebook, with an example of using weather factors as extra regressors in a forecast of bicycle usage.

queise answered Oct 16 '22 17:10


I am confused: there seems to be no agreement on whether Prophet works in a multivariate way (see the GitHub issues here and here). Judging by some comments, queise's answer, and a nice YouTube tutorial, you can build a workaround that gives multivariate functionality. See the video here: https://www.youtube.com/watch?v=XZhPO043lqU

NeStack answered Oct 16 '22 16:10


To forecast more than one dependent variable, you need to model the time series with Vector Auto Regression (VAR).

In a VAR model, each variable is a linear function of its own past values and the past values of all the other variables.
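To make that definition concrete, here is a small sketch of a first-order VAR on two simulated variables, fitted by ordinary least squares with NumPy. The coefficient matrix A_true, the noise scale, and the helper forecast are all assumptions made for this illustration; a library implementation such as the VAR class in statsmodels would usually be the better choice in practice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed coefficients of a 2-variable VAR(1): x_t = A @ x_{t-1} + noise
A_true = np.array([[0.5, 0.2],
                   [0.1, 0.6]])

# Simulate the process so there is data to fit.
n_obs = 2000
x = np.zeros((n_obs, 2))
for t in range(1, n_obs):
    x[t] = A_true @ x[t - 1] + rng.normal(scale=0.1, size=2)

# Least squares: regress each row x_t on the previous row x_{t-1}.
# lstsq solves X_past @ coef = X_next, so coef is A transposed.
X_past, X_next = x[:-1], x[1:]
coef, *_ = np.linalg.lstsq(X_past, X_next, rcond=None)
A_hat = coef.T


def forecast(last_obs, A, steps):
    """Iterate the fitted VAR(1) forward, feeding each prediction back in."""
    preds = []
    current = last_obs
    for _ in range(steps):
        current = A @ current
        preds.append(current)
    return np.array(preds)


# 30-step-ahead forecast for both variables at once.
preds = forecast(x[-1], A_hat, steps=30)
```

The off-diagonal entries of A_hat are what make the model multivariate: they capture how the past of one variable feeds into the next value of the other.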

For more information on VAR, see https://www.analyticsvidhya.com/blog/2018/09/multivariate-time-series-guide-forecasting-modeling-python-codes/

Zeeshan answered Oct 16 '22 16:10