Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

coda gelman.diag(): "Error in chol.default(W) : the leading minor of order nn is not positive definite"

Tags:

r

mcmc

The gelman.diag() function in the coda package throws an error when calculating the multivariate potential scale reduction factor (MPSRF).

> load("short_mcmc_list.rda")
> niter(short.mcmc.list)
[1] 100
> nvar(short.mcmc.list)
[1] 200
> nchain(short.mcmc.list)
[1] 2
> 
> coda::gelman.diag(
    short.mcmc.list, 
    autoburnin = FALSE,
    multivariate = TRUE
)

Error in chol.default(W) : the leading minor of order 199 is not positive definite

What does this error mean?

This question was previously posted at R coda "The leading minor of order 3 is not positive definite". The main conclusion is: "Conclusion: There seems to be something wrong with getting the multivariate estimate of the Gelman-Rubin diagnostic. Setting multivariate = FALSE fixes the problem and outputs a single variable estimate for each variable." It is 6 years old so the answers may be outdated.

like image 790
phargart Avatar asked Oct 29 '25 10:10

phargart


1 Answers

In gelman.diag(), the MPSRF is calculated by:

> coda::gelman.diag <-
function (x, confidence = 0.95, transform = FALSE, autoburnin = FALSE,
        multivariate = TRUE)
{
#A lot of code ...

Niter <- niter(x)
Nchain <- nchain(x)
Nvar <- nvar(x)
#W is the variance matrix of the chain
S2 <- array(sapply(x, var, simplify = TRUE), dim = c(Nvar,
                                                   Nvar, Nchain))
W <- apply(S2, c(1, 2), mean)
#A lot of code ...

if (Nvar > 1 && multivariate) {
CW <- chol(W)
#This is max eigenvalue from W^-1*B.
emax <- eigen(
  backsolve(CW, t(backsolve(CW, B, transpose = TRUE)), transpose = TRUE),
  symmetric = TRUE, only.values = TRUE)$values[1]
}

I edited the gelman.diag() by removing the Cholesky decomposition that gave the error, and by adding W and B to the list to be returned. This allows me to see why W cannot undergo Cholesky decomposition.

my.gelman.diag <- function(x,
                           confidence = 0.95,
                           transform = FALSE, 
                           autoburnin = FALSE,
                           multivariate = TRUE
){
  x <- as.mcmc.list(x)
  if (nchain(x) < 2)
    stop("You need at least two chains")
  if (autoburnin && start(x) < end(x)/2)
    x <- window(x, start = end(x)/2 + 1)
  Niter <- niter(x)
  Nchain <- nchain(x)
  Nvar <- nvar(x)
  xnames <- varnames(x)
  if (transform)
    x <- gelman.transform(x)
  x <- lapply(x, as.matrix)
  S2 <- array(sapply(x, var, simplify = TRUE),
              dim = c(Nvar, Nvar, Nchain)
  )
  W <- apply(S2, c(1, 2), mean)
  xbar <- matrix(sapply(x, apply, 2, mean, simplify = TRUE),
                 nrow = Nvar, ncol = Nchain)
  B <- Niter * var(t(xbar))
  if (Nvar > 1 && multivariate) {  #ph-edits 
    # CW <- chol(W)
    #    #This is W^-1*B.
    # emax <- eigen(
    #  backsolve(CW, t(backsolve(CW, B, transpose = TRUE)), transpose = TRUE),
    # symmetric = TRUE, only.values = TRUE)$values[1]
    emax <- 1
    mpsrf <- sqrt((1 - 1/Niter) + (1 + 1/Nvar) * emax/Niter)
  }  else {
    mpsrf <- NULL
  }

  w <- diag(W)
  b <- diag(B)
  s2 <- matrix(apply(S2, 3, diag), nrow = Nvar, ncol = Nchain)
  muhat <- apply(xbar, 1, mean)
  var.w <- apply(s2, 1, var)/Nchain
  var.b <- (2 * b^2)/(Nchain - 1)
  cov.wb <- (Niter/Nchain) * diag(var(t(s2), t(xbar^2)) - 2 *
                                    muhat * var(t(s2), t(xbar)))
  V <- (Niter - 1) * w/Niter + (1 + 1/Nchain) * b/Niter
  var.V <- ((Niter - 1)^2 * var.w + (1 + 1/Nchain)^2 * var.b +
              2 * (Niter - 1) * (1 + 1/Nchain) * cov.wb)/Niter^2
  df.V <- (2 * V^2)/var.V
  df.adj <- (df.V + 3)/(df.V + 1)
  B.df <- Nchain - 1
  W.df <- (2 * w^2)/var.w
  R2.fixed <- (Niter - 1)/Niter
  R2.random <- (1 + 1/Nchain) * (1/Niter) * (b/w)
  R2.estimate <- R2.fixed + R2.random
  R2.upper <- R2.fixed + qf((1 + confidence)/2, B.df, W.df) *
    R2.random
  psrf <- cbind(sqrt(df.adj * R2.estimate), sqrt(df.adj * R2.upper))
  dimnames(psrf) <- list(xnames, c("Point est.", "Upper C.I."))
  out <- list(psrf = psrf, mpsrf = mpsrf, B = B, W = W) #added ph
  class(out) <- "gelman.diag"
  return( out )
}

Applying my.gelman.diag() to short.mcmc.list:

> l <- my.gelman.diag(short.mcmc.list, autoburnin = FALSE, multivariate = TRUE)
> W <- l$W  #Within-sequence variance
> B <- l$B  #Between-sequence variance

An investigation into W shows that W is indeed positive definite, but its eigen-values are near 0 and hence it fails.

> evals.W <- eigen(W, only.values = TRUE)$values
> min(evals.W)
[1] 1.980596e-16

Indeed, increasing the tolerance shows that W is indeed positive definite.

> matrixNormal::is.positive.definite(W, tol = 1e-18)
[1] TRUE

So in reality, W is near linear dependency...

> eval <- eigen(solve(W)%*%B, only.values = TRUE)$values[1]

Error in solve.default(W) : system is computationally singular: reciprocal condition number = 7.10718e-21

So in fact, removing the last two columns of W makes it now linear independent/positive definite. This indicates that there are correlated parameters within the chain, and the number of parameters can be reduced.

> evals.W[198]
[1] 9.579362e-05
> matrixNormal::is.positive.definite(W[1:198, 1:198])
[1] TRUE
> chol.W <- chol(W)

W is the within-variance of Markov chain, a measure of how each value in the state is different from the mean. If W is near singular, the variance is small, and thus the chain does not vary much. It is a slow-moving chain. The user should investigate this using trace plots, and possibly reducing the number of parameters. The chain may also be too short; if the chain is longer, the values within the chain may be different enough so that W is linearly independent.

To avoid the function from crashing, I suggest to use purrr::possibly() to substitute a missing value instead of throwing an archaic error.

>  pgd <- purrr::possibly(coda::gelman.diag, list(mpsrf = NA), quiet = FALSE)
> pgd(short.mcmc.list, autoburnin = FALSE, multivariate = TRUE)
Error: the leading minor of order 199 is not positive definite
[1] NA

See Brooks and Gelman, 1992 for more information.

Reference: Gelman, A and Rubin, DB (1992) Inference from iterative simulation using multiple sequences, Statistical Science, 7, 457-511.

like image 73
phargart Avatar answered Nov 01 '25 01:11

phargart



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!