Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Extract p-value from checkresiduals function

Tags:

r

forecasting

I am forecasting with forecast package.Below is results from my forecasting.

#CODE 
library(forecast)
      DATA_SET<-data.frame(TEST=c(200,220,200,260,300,290,320,340,360,500,200,300,400,250,350,390,400,450,470,350,300,220,580,450,120,250,360,470)
                           )
      View(DATA_SET)

      # Making TS object
      TS_DATA_SET<-ts(DATA_SET,start=c(2010,1),frequency = 12)

      # Forecasting
      TS_FORECAST<-auto.arima(TS_DATA_SET)

So now I want to extract p-value from checkresiduals function into data frame,

   #Checking residuals
   checkresiduals(TS_FORECAST, plot = FALSE)

##  Ljung-Box test
##
##   data:  Residuals from ARIMA(0,0,0) with non-zero mean
## Q* = 4.5113, df = 4.6, p-value = 0.4237
##
## Model df: 1.   Total lags used: 5.6

I am trying with code below but I have problem

p-value<-data.frame(checkresiduals(TS_FORECAST, plot = FALSE))
p_value
#data frame with 0 columns and 0 rows

So can anybody help me how to extract p-value (p-value = 0.4237) from checkresiduals function into data.frame ?

like image 938
silent_hunter Avatar asked Apr 18 '19 08:04

silent_hunter


2 Answers

Edit:

My first method below got implemented on the 'checkresiduals()' function. Nowadays the function returns the output values by default.


old answer:

Unfortunately the function checkresiduals() doesn't return values, just prints() them. You can see the function by writing checkresiduals without brackets. Or you check the github of the developer.

You can rewrite the function by putting a return() in it. I just copy paste the function and insert it at the end:

 checkresiduals <- function(object, lag, df=NULL, test, plot=TRUE, ...) {
  showtest <- TRUE
  if (missing(test)) {
    if (is.element("lm", class(object))) {
      test <- "BG"
    } else {
      test <- "LB"
    }
    showtest <- TRUE
  }
  else if (test != FALSE) {
    test <- match.arg(test, c("LB", "BG"))
    showtest <- TRUE
  }
  else {
    showtest <- FALSE
  }

  # Extract residuals
  if (is.element("ts", class(object)) | is.element("numeric", class(object))) {
    residuals <- object
    object <- list(method = "Missing")
  }
  else {
    residuals <- residuals(object)
  }

  if (length(residuals) == 0L) {
    stop("No residuals found")
  }

  if ("ar" %in% class(object)) {
    method <- paste("AR(", object$order, ")", sep = "")
  } else if (!is.null(object$method)) {
    method <- object$method
  } else if ("HoltWinters" %in% class(object)) {
    method <- "HoltWinters"
  } else if ("StructTS" %in% class(object)) {
    method <- "StructTS"
  } else {
    method <- try(as.character(object), silent = TRUE)
    if ("try-error" %in% class(method)) {
      method <- "Missing"
    } else if (length(method) > 1 | base::nchar(method[1]) > 50) {
      method <- "Missing"
    }
  }
  if (method == "Missing") {
    main <- "Residuals"
  } else {
    main <- paste("Residuals from", method)
  }

  if (plot) {
    suppressWarnings(ggtsdisplay(residuals, plot.type = "histogram", main = main, ...))
  }

  # Check if we have the model
  if (is.element("forecast", class(object))) {
    object <- object$model
  }

  if (is.null(object) | !showtest) {
    return(invisible())
  }

  # Seasonality of data
  freq <- frequency(residuals)

  # Find model df
  if(grepl("STL \\+ ", method)){
    warning("The fitted degrees of freedom is based on the model used for the seasonally adjusted data.")
  }
  df <- modeldf(object)

  if (missing(lag)) {
    lag <- ifelse(freq > 1, 2 * freq, 10)
    lag <- min(lag, round(length(residuals)/5))
    lag <- max(df+3, lag)
  }

  if (!is.null(df)) {
    if (test == "BG") {
      # Do Breusch-Godfrey test
      BGtest <- lmtest::bgtest(object, order = lag)
      BGtest$data.name <- main
      print(BGtest)
      return(BGtest)
    }
    else {
      # Do Ljung-Box test
      LBtest <- Box.test(zoo::na.approx(residuals), fitdf = df, lag = lag, type = "Ljung")
      LBtest$method <- "Ljung-Box test"
      LBtest$data.name <- main
      names(LBtest$statistic) <- "Q*"
      print(LBtest)
      cat(paste("Model df: ", df, ".   Total lags used: ", lag, "\n\n", sep = ""))
      return(LBtest)
    }
  }

}

you also need the modeldf() function from the github file

modeldf <- function(object, ...){
  UseMethod("modeldf")
}

modeldf.Arima <- function(object, ...){
  length(object$coef)
}

With this solution you use your original checkresiduals function. Now you can call the p.value with:

res_values <- checkresiduals(TS_FORECAST, plot = TRUE)
res_values$p.value

You could also just use the Ljung-Box and Breusch-Godfrey test by yourself and ignore the checkresiduals() function, since this is what checkresiduals() does.

I thought editing the checkresiduals() function is a more handy way, so you can use it like you are used to it. You can paste it in your code and it should do the work. Just make sure you declare the modeldf() and modeldf().Arima before you call the function. Also it works either test the function does.


Option 2 because it's possible:

You could capture the output with capture.output()

capture.output(checkresiduals(TS_FORECAST, plot = FALSE))[5]

"Q* = 4.8322, df = 5, p-value = 0.4367"

With a grep command it should be possible to extract the p-value without changing the function. Since i'm not familiar with grep, i can't provide a proper answer on this task.

like image 157
mischva11 Avatar answered Nov 15 '22 09:11

mischva11


Here you can see the inside of checkresiduals().

Unfortunately, as per the docs, it doesn't return a value, so you can't easy extract what you need.

But you can do the same calculation (look at line 125 in the linked repo):

Box.test(zoo::na.approx(TS_FORECAST$residuals), type = "Ljung")

To access the p-value just use $p.value, after you assign the output to a variable.

Note that in my quick example it's a bit different because I used the default values:

# r$p.value
# [1] 0.3678976
like image 33
RLave Avatar answered Nov 15 '22 08:11

RLave