My dataset has the following 3 columns:
date client_id sales
01/01/2012 client 1 $1000
02/01/2012 client 1 $900
...
...
12/01/2014 client 1 $1000
01/01/2012 client 2 $300
02/01/2012 client 2 $450
...
..
12/01/2014 client 2 $375
and so on for 98 other clients.
I have around 100 clients, and the data for each client is a monthly time series (24 monthly data points per client).
How do I automatically forecast sales for all 100 clients using auto.arima in R? Is there a "by" statement option, or do I have to use loops?
Thanks
You can always use lapply():
lapply(tsMat, function(x) forecast(auto.arima(x)))
A little example follows:
library(forecast)
# generate some time series (one column per client):
sales <- replicate(100,
  arima.sim(n = 24, list(ar = c(0.8), ma = c(-0.2)), sd = sqrt(0.1))
)
dates <- seq(as.Date("2012/1/1"), by = "month", length.out = 24)
df <- data.frame(date = rep(dates, 100), client_id = rep(1:100, each = 24), sales = c(sales))
# reshape to wide format (one column per client), drop the date column
# so it is not forecast as a series, and convert to a monthly ts:
wide <- reshape2::dcast(df, date ~ client_id, value.var = "sales")
tsMat <- ts(wide[, -1], start = c(2012, 1), frequency = 12)
# forecast each client's series with auto.arima:
output <- lapply(tsMat, function(x) forecast(auto.arima(x)))
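If you would rather skip the reshape step, one possible alternative is to split the long data frame by client and build one ts per client. This is only a sketch under assumptions: the toy data (3 made-up clients with random sales), the 12-month horizon, and the names fc_list and point_fc are all illustrative and not from the question.

```r
library(forecast)

set.seed(1)
# Toy data in the question's long layout: 3 hypothetical clients, 24 months each
dates <- seq(as.Date("2012-01-01"), by = "month", length.out = 24)
df <- data.frame(
  date      = rep(dates, times = 3),
  client_id = rep(paste("client", 1:3), each = 24),
  sales     = round(runif(72, min = 300, max = 1000))
)

# Fit and forecast one model per client
fc_list <- lapply(split(df, df$client_id), function(d) {
  d <- d[order(d$date), ]                                # make sure rows are in time order
  y <- ts(d$sales, start = c(2012, 1), frequency = 12)   # monthly series starting Jan 2012
  forecast(auto.arima(y), h = 12)                        # assumed horizon: 12 months
})

# Collect the point forecasts: one column per client, one row per forecast month
point_fc <- sapply(fc_list, function(f) as.numeric(f$mean))
```

Each element of fc_list is an ordinary forecast object, so fc_list[["client 1"]] can be printed or plotted on its own.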
You can also specify how many periods ahead to forecast by passing h = <number of periods> in the forecast() call, for example h = 67 (applied here to a multivariate series named ts.allStates):
Forecast.allStates <- as.data.frame(lapply(ts.allStates, function(x) forecast(auto.arima(x), h = 67)))
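To see what h controls, here is a small single-series sketch. The simulated series, the horizon of 6, and the names y, fc and fc_df are assumptions made for illustration only.

```r
library(forecast)

set.seed(42)
# A simulated monthly series standing in for one client's sales
y <- ts(arima.sim(n = 24, list(ar = 0.8), sd = sqrt(0.1)),
        start = c(2012, 1), frequency = 12)

# h sets the number of future periods to forecast
fc <- forecast(auto.arima(y), h = 6)

# as.data.frame() on a forecast object gives the point forecast
# plus the lower and upper prediction interval bounds, one row per period
fc_df <- as.data.frame(fc)
```

With h = 6 the resulting table has six rows, one for each forecast month.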