I have a data frame with time series data, called rData. The data is quarterly and four years of it are available. I analyzed the data and fitted an ARIMA model to the series, so I can now compute forecasts for the periods that follow. However, I want to create a new column in my data frame that holds the forecast value corresponding to each available time stamp, and then plot the two series against each other in R. Is there a way to compute these forecast values in R without individually analyzing all of the data prior to each time stamp? Also, how many cycles of data are necessary before a forecast can be computed?
Date <- seq(as.Date("2000-01-01"), as.Date("2003-12-31"), by = "quarter")
Sales <- c(2.8,2.1,4,4.5,3.8,3.2,4.8,5.4,4,3.6,5.5,5.8,4.3,3.9,6,6.4)
rData <- data.frame(Date, Sales)
tsData <- ts(data = rData$Sales, start = c(2000, 1), frequency = 4)
> tsData
     Qtr1 Qtr2 Qtr3 Qtr4
2000  2.8  2.1  4.0  4.5
2001  3.8  3.2  4.8  5.4
2002  4.0  3.6  5.5  5.8
2003  4.3  3.9  6.0  6.4
library(forecast)  # provides auto.arima() and forecast()
myModel <- auto.arima(tsData)
myForcast <- forecast(myModel, level = 95, h = 8)
The end result should be a data frame with an additional column and a graph with two plots, one for the actual data and one for the forecast data. Something like this:
[Image: Actual Data vs Forecast Data]

Did you mean something like this, for the past values? If so, just add this to your code:
extract_fitted_values <- myModel$fitted
plot(tsData, xlab = "Time", ylab = "Sales", type = "b", pch = 19)
lines(extract_fitted_values, col = "red")
As you see, you can extract the fitted values from the model fit.
Regarding your question: the data prior to the forecast period IS actually analyzed when you run auto.arima.
That is how an ARIMA model estimates its parameters (by using past data) before it proceeds to the forecasts. The auto.arima function additionally chooses the model specification automatically.
So the analysis of the prior data is a prerequisite for the subsequent forecasts. It is worth noting that the red line you see here represents the fitted values, i.e. the model's parameters are estimated from all the data points up to the last time point, and those parameters are then used to produce the numbers.
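To get the fitted values into the question's data frame as a new column, something along these lines should work. This is a minimal sketch that assumes the forecast package is installed; the column names Fitted and Residual are just illustrative choices, not anything from the original post.

```r
library(forecast)  # assumed installed; provides auto.arima() and a fitted() method

# Data from the question
Date  <- seq(as.Date("2000-01-01"), as.Date("2003-12-31"), by = "quarter")
Sales <- c(2.8, 2.1, 4, 4.5, 3.8, 3.2, 4.8, 5.4, 4, 3.6, 5.5, 5.8, 4.3, 3.9, 6, 6.4)
rData  <- data.frame(Date, Sales)
tsData <- ts(rData$Sales, start = c(2000, 1), frequency = 4)

myModel <- auto.arima(tsData)

# New column: the in-sample fitted value for each existing time stamp
rData$Fitted <- as.numeric(fitted(myModel))

# Illustrative extra column: actual minus fitted, i.e. the model residual
rData$Residual <- rData$Sales - rData$Fitted

# Actual vs fitted against the real dates
plot(rData$Date, rData$Sales, type = "b", pch = 19, xlab = "Date", ylab = "Sales")
lines(rData$Date, rData$Fitted, col = "red")
```

Because rData already carries a Date column, plotting against it avoids having to build the axis labels by hand.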
Maybe see more here if that point is a bit unclear: https://stats.stackexchange.com/questions/260899/what-is-difference-between-in-sample-and-out-of-sample-forecasts
If you wanted to do "out-of-sample" forecasts for the past data (2000-2003), that is also possible, but you would need to fit on, say, 2000-2002, produce a 1-step forecast, then roll 1 quarter forward, refit, and repeat the same until the end of the series.
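The rolling procedure described above could be sketched roughly as follows. To keep it dependency-free and stable on very short training windows, this sketch uses base R's arima() with a fixed seasonal order instead of re-running auto.arima at every step. That order, the two-year starting window, and the names first_train, oos and rolling are all assumptions made for illustration, not what auto.arima selected in the question.

```r
# Rolling-origin, one-step-ahead out-of-sample forecasts (illustrative sketch)
Sales  <- c(2.8, 2.1, 4, 4.5, 3.8, 3.2, 4.8, 5.4, 4, 3.6, 5.5, 5.8, 4.3, 3.9, 6, 6.4)
tsData <- ts(Sales, start = c(2000, 1), frequency = 4)

n           <- length(tsData)
first_train <- 8               # assumption: start from two full years of data
oos         <- rep(NA_real_, n)  # NA where no out-of-sample forecast exists

for (i in first_train:(n - 1)) {
  # Training window: everything up to and including quarter i
  train <- ts(Sales[1:i], start = c(2000, 1), frequency = 4)

  # Assumed model: ARIMA(0,1,0)(0,1,0)[4], chosen only because it has no
  # coefficients to estimate and therefore fits on very short series
  fit <- arima(train, order = c(0, 1, 0),
               seasonal = list(order = c(0, 1, 0), period = 4))

  # Forecast the next quarter, which the model has not seen
  oos[i + 1] <- predict(fit, n.ahead = 1)$pred[1]
}

rolling <- data.frame(time = as.numeric(time(tsData)),
                      Sales = Sales,
                      OutOfSample = oos)
```

Each value in OutOfSample is computed using only the quarters before it, which is what distinguishes it from the fitted values. The first eight rows stay NA because they serve as the initial training window.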
If you want them in a data.frame, and want to plot the real values against the fitted and predicted values, you can try this:
df <- data.frame(
  # your data, padded with NAs for the forecast horizon
  real = c(tsData, rep(NA, length(data.frame(myForcast)$Point.Forecast))),
  # the fitted values followed by the point forecasts, in one vector
  pred = c(myModel$fitted, data.frame(myForcast)$Point.Forecast),
  # the time axis for the plot
  time = c(time(tsData), seq(2004, 2005.75, by = 0.25))
)
plot(df$real, xlab = "time", ylab = "real black, pred red", type = "b", pch = 19,xaxt="n")
lines(df$pred, col = "red")
axis(1, at=1:24, labels=df$time)

For the theory part, as already said, the fitted values are calculated when you run your model. Fitting the model is the basis for the forecasting, but you can of course have the fitted values without forecasting.