
Un-log a time series while using the forecast package

Tags:

r

time-series

Hello, I use the forecast package for time-series forecasting. I would like to know how to un-log a series on the final forecast plot; with the forecast package I don't know how to do this. Here is an example:

library(forecast)
data <- AirPassengers
data <- log(data) # not necessary for AirPassengers, but my private data needs the log because of some high peaks
ARIMA <- arima(data, order = c(1, 0, 1), list(order = c(12,0, 12), period = 1)) # just a fake ARIMA for this example
plot(forecast(ARIMA, h=24)) # my question: how do I get this forecast plot on the scale of the non-logged AirPassengers data?

So the plot is on the log scale. I want the same ARIMA model, but with the forecast shown for the non-logged data.

S12000 asked Nov 29 '22 01:11


2 Answers

It is not necessary to use the hack proposed by @ndoogan. forecast.Arima has built-in facilities for undoing transformations. The following code will do what is required:

fc <- forecast(ARIMA, h=24, lambda=0)

Better still, build the transformation into the model itself:

ARIMA <- Arima(AirPassengers, order=c(1,0,1), seasonal=list(order=c(1,0,1), period=12), lambda=0) # pass the un-logged series; lambda=0 applies the log internally
fc <- forecast(ARIMA, h=24)

Note that you need to use the Arima function from the forecast package to do this, not the arima function from the stats package.

@Hemmo is correct that this back-transformation will not give the mean of the forecast distribution, and so is not the optimal MSE forecast. However, it will give the median of the forecast distribution, and so will give the optimal MAE forecast.
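To illustrate the median/mean distinction, here is a minimal base-R sketch. It assumes the forecast distribution on the log scale is Gaussian with mean mu and standard deviation sigma; both values below are made up for illustration and do not come from any fitted model. Under that assumption exp(mu) is the median on the original scale, while the mean is exp(mu + sigma^2/2).

```r
# Hypothetical log-scale forecast distribution: N(mu, sigma^2)
mu    <- 6.2   # illustrative point forecast on the log scale
sigma <- 0.3   # illustrative forecast standard error on the log scale

set.seed(42)
z <- rnorm(1e5, mean = mu, sd = sigma)  # draws on the log scale
y <- exp(z)                             # same draws on the original scale

back_median <- exp(mu)                  # plain back-transformation
back_mean   <- exp(mu + sigma^2 / 2)    # lognormal mean (bias-adjusted)

# Compare the analytic values with the simulated ones
c(median_sim = median(y), median_exact = back_median)
c(mean_sim   = mean(y),   mean_exact   = back_mean)
```

Recent versions of the forecast package also appear to expose a biasadj argument for this kind of adjustment; check the documentation of your installed version before relying on it.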

Finally, the fake model used by @Swiss12000 makes little sense as the seasonal part has frequency 1, and so is confounded with the non-seasonal part. I think you probably meant the model I've used in the code above.

Rob Hyndman answered Dec 15 '22 10:12


The problem with @ndoogan's answer is that the logarithm is not a linear transformation, which means that E[exp(y)] != exp(E[y]). In fact, Jensen's inequality gives E[exp(y)] >= exp(E[y]). Here's a simple demonstration:

set.seed(1)
x<-rnorm(1000)
mean(exp(x))
[1] 1.685356
exp(mean(x))
[1] 0.9884194

Here's a case concerning the prediction:

# Simulate AR(1) process
set.seed(1)
y<-10+arima.sim(model=list(ar=0.9),n=100)

# Fit on logarithmic scale
fit<-arima(log(y),c(1,0,0))

#Simulate one step ahead
set.seed(123)
y_101_log <- fit$coef[2]*(1-fit$coef[1]) + 
             fit$coef[1]*log(y[100]) + rnorm(n=1000,sd=sqrt(fit$sigma2))

y_101<-exp(y_101_log) #transform to natural scale

exp(mean(y_101_log)) # This is exp(E(log(y_101)))
[1] 5.86717          # Same as exp(predict(fit,n.ahead=1)$pred),
                     # up to simulation noise

mean(y_101)          # This is E(exp(log(y_101)))=E(y_101)
[1] 5.904633

# 95% Prediction intervals:

#Naive way:
pred<-predict(fit,n.ahead=1)
c(exp(pred$pred-1.96*pred$se),exp(pred$pred+1.96*pred$se))
pred$pred pred$pred 
 4.762880  7.268523 

# Correct ones:
quantile(y_101,probs=c(0.025,0.975))
    2.5%    97.5% 
4.772363 7.329826 

This also provides a general solution to your problem:

  1. Fit your model
  2. Simulate multiple samples from that model (for example one step ahead predictions as above)
  3. For each simulated sample, make the inverse transformation to get the values in original scale
  4. From these simulated samples you can compute the expected value as an ordinary mean or, if you need prediction intervals, compute empirical quantiles.
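
The four steps above can be sketched for a multi-step horizon as follows. This is a minimal base-R sketch, assuming an AR(1) model on the log scale with Gaussian innovations and ignoring parameter uncertainty. The names h, nsim, paths and the fc_* vectors are illustrative and not part of the original answer.

```r
# 1. Fit the model on the log scale (same simulated AR(1) data as above)
set.seed(1)
y   <- 10 + arima.sim(model = list(ar = 0.9), n = 100)
fit <- arima(log(y), order = c(1, 0, 0))

phi <- unname(fit$coef["ar1"])        # AR coefficient
mu  <- unname(fit$coef["intercept"])  # process mean on the log scale
sdv <- sqrt(fit$sigma2)               # innovation standard deviation

# 2. Simulate nsim future paths of length h on the log scale
h    <- 12
nsim <- 5000
set.seed(123)
paths <- matrix(NA_real_, nrow = nsim, ncol = h)
prev  <- rep(log(y[length(y)]), nsim)  # last observed value, log scale
for (k in seq_len(h)) {
  prev <- mu + phi * (prev - mu) + rnorm(nsim, sd = sdv)
  paths[, k] <- prev
}

# 3. Back-transform every simulated value to the original scale
paths_orig <- exp(paths)

# 4. Summarise: mean forecast and empirical 95% prediction interval
fc_mean  <- colMeans(paths_orig)
fc_lower <- apply(paths_orig, 2, quantile, probs = 0.025)
fc_upper <- apply(paths_orig, 2, quantile, probs = 0.975)
head(cbind(mean = fc_mean, lower = fc_lower, upper = fc_upper))
```

Because the back-transformation is applied to each simulated value before summarising, the mean and the quantiles are computed directly on the original scale, so no analytic bias correction is needed.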

Jouni Helske answered Dec 15 '22 12:12