For loop for forecasting several datasets at once in R

Question

I have a dataset with the variables Time, Region, and Sales, and I want to forecast sales for each region with ARIMA or ETS (SES) from library(forecast). There are 70 regions in total, each with 152 weekly observations (about 3 years of data). Something like this:

  Week      Region    Sales 
01/1/2011      A       129
07/1/2011      A       140
14/1/2011      A       133
21/1/2011      A       189
...           ...      ...
01/12/2013     Z       324
07/12/2013     Z       210
14/12/2013     Z       155
21/12/2013     Z       386
28/12/2013     Z       266

So I want R to treat every region as a separate dataset and run auto.arima on each one. I am guessing a for loop is the right fit here, but I failed miserably with it. Ideally, the loop would run something like this (one auto.arima for every block of 152 observations):

fit.A <- auto.arima(data$Sales[1:152])  
fit.B <- auto.arima(data$Sales[153:304])
....
fit.Z <- auto.arima(data$Sales[10489:10640])

I came across this, but when I converted the data frame into a time series, all I got was NAs.

Any help is appreciated! Thank you.

David Arenburg · Accepted Answer

Try the very efficient data.table package (assuming your data set is called temp):

library(data.table)
library(forecast)
temp <- setDT(temp)[, list(AR = list(auto.arima(Sales))), by = Region]

The last step saves the results in temp as a list column, since that is the only way to store objects of this type in a table column.
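To make the list column more concrete, here is a minimal sketch of pulling out the model for a single region. The toy data and the names sales, fits, and fit_A are assumptions made up for illustration; they are not part of the original data.

```r
library(data.table)
library(forecast)

# Hypothetical toy data standing in for the real sales table
set.seed(42)
sales <- data.table(
  Region = rep(c("A", "B"), each = 60),
  Sales  = round(runif(120, min = 100, max = 400))
)

# One row per region; the AR column is a list holding one fitted model per row
fits <- sales[, list(AR = list(auto.arima(Sales))), by = Region]

# Subset to one region, then take the single element of its list column
fit_A <- fits[Region == "A"]$AR[[1]]

class(fit_A)  # the usual Arima classes, so summary(), coef(), forecast() apply
coef(fit_A)
```

Because each element is an ordinary fitted model, anything that works on the result of a single auto.arima call should work on fit_A as well.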

Afterwards you can perform any operation you want on these lists, for example, inspecting them:

temp$AR
#[[1]]
# Series: Sales 
# ARIMA(0,0,0) with non-zero mean 
# 
# Coefficients:
#   intercept
# 147.7500
# s.e.    12.0697
# 
# sigma^2 estimated as 582.7:  log likelihood=-18.41
# AIC=40.82   AICc=52.82   BIC=39.59
#
#[[2]]
# Series: Sales 
# ARIMA(0,0,0) with non-zero mean 
# 
# Coefficients:
#   intercept
# 268.2000
# s.e.    36.4404
# 
# sigma^2 estimated as 6639:  log likelihood=-29.1
# AIC=62.19   AICc=68.19   BIC=61.41

Or plot the forecasts, and so on:

temp[, sapply(AR, function(x) plot(forecast(x, 10)))]
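If the forecast numbers are needed rather than plots, one possible approach is to collect the point forecasts into a long table with one row per region and step. This is only a sketch: the toy data and the names sales, fits, horizon, and fc are assumptions for illustration.

```r
library(data.table)
library(forecast)

# Hypothetical toy data: 2 regions with 60 observations each
set.seed(1)
sales <- data.table(
  Region = rep(c("A", "B"), each = 60),
  Sales  = round(runif(120, min = 100, max = 400))
)

# Fit one model per region, stored in a list column
fits <- sales[, list(AR = list(auto.arima(Sales))), by = Region]

# Number of steps ahead to forecast (assumed value)
horizon <- 10

# Within each group, AR is a list of length 1, so AR[[1]] is that region's model;
# forecast()$mean holds the point forecasts
fc <- fits[, list(
  Step     = seq_len(horizon),
  Forecast = as.numeric(forecast(AR[[1]], h = horizon)$mean)
), by = Region]

head(fc)
```

The resulting fc table can then be merged back with other region-level data or written out with fwrite().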

ramhiser · Answer

You can do this easily with dplyr. Assuming your data frame is named df, run:

library(dplyr)
library(forecast)
model_fits <- group_by(df, Region) %>% do(fit=auto.arima(.$Sales))

The result is a data frame containing the model fits for each region:

> head(model_fits)
Source: local data frame [6 x 2]
Groups: <by row>

  Region        fit
1      A <S3:Arima>
2      B <S3:Arima>
3      C <S3:Arima>
4      D <S3:Arima>
5      E <S3:Arima>
6      F <S3:Arima>

You can get a list with each model fit like so:

> model_fits$fit
[[1]]
Series: .$Sales 
ARIMA(0,0,0) with non-zero mean 

Coefficients:
      intercept
       196.0000
s.e.    14.4486

sigma^2 estimated as 2088:  log likelihood=-52.41
AIC=108.82   AICc=110.53   BIC=109.42

[[2]]
Series: .$Sales 
ARIMA(0,0,0) with non-zero mean 

Coefficients:
      intercept
       179.2000
s.e.    14.3561

sigma^2 estimated as 2061:  log likelihood=-52.34
AIC=108.69   AICc=110.4   BIC=109.29
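In recent dplyr releases do() is documented as superseded, so a similar result can be sketched with summarise() and a list column. This is an alternative formulation, not the code from the answer above; the toy data frame and the name fits_by_region are assumptions for illustration.

```r
library(dplyr)
library(forecast)

# Hypothetical toy data: 2 regions with 60 observations each
set.seed(7)
df <- data.frame(
  Region = rep(c("A", "B"), each = 60),
  Sales  = round(runif(120, min = 100, max = 400))
)

# Wrapping the fit in list() stores one model per region in a list column
model_fits <- df %>%
  group_by(Region) %>%
  summarise(fit = list(auto.arima(Sales)))

# Name the list by region so a model can be looked up directly
fits_by_region <- setNames(model_fits$fit, model_fits$Region)

# Forecast 4 steps ahead for region A
forecast(fits_by_region[["A"]], h = 4)
```

Naming the list by region avoids relying on row order when picking out a particular model.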
