Multi-level regression model on multiply imputed data set in R (Amelia, zelig, lme4)

Tags:

I am trying to run a multi-level model on multiply imputed data (created with Amelia); the sample is based on a clustered sample with group = 24, N= 150.

library("ZeligMultilevel")
ML.model.0 <- zelig(dv~1 + tag(1|group), model="ls.mixed",
data=a.out$imputations)
summary(ML.model.0)

This code produces the following error code:

Error in object[[1]]$result$call : 
$ operator not defined for this S4 class

If I run a OLS regression, it works:

model.0 <- zelig(dv~1, model="ls", data=a.out$imputations)
m.0 <- coef(summary(model.0)) 
print(m.0, digits = 2)

      Value Std. Error t-stat  p-value
[1,]    45       0.34    130 2.6e-285

I am happy to provide a working example.

require(Zelig)
require(Amelia)
require(ZeligMultilevel)

data(freetrade)
length(freetrade$country) #grouping variable

#Imputation of missing data

a.out <- amelia(freetrade, m=5, ts="year", cs="country")

# Models: (1) OLS; (2) multi-level 

model.0 <- zelig(polity~1, model="ls", data=a.out$imputations)
m.0 <- coef(summary(model.0)) 
print(m.0, digits = 2)

ML.model.0 <- zelig(polity~1 + tag(1|country), model="ls.mixed", data=a.out$imputations)
summary(ML.model.0)

I think the issue may be with how Zelig interfaces with Amelia's mi class. Therefore, I turned toward an alternative R package: lme4.

require(lme4)
write.amelia(obj=a.out, file.stem="inmi", format="csv", na="NA")
diff <-list(5)  # a list to store each model, 5 is the number of the imputed datasets

for (i in 1:5) {
file.name <- paste("inmi", 5 ,".csv",sep="")
data.to.use <- read.csv(file.name)
diff[[5]] <- lmer(polity ~ 1 + (1 | country),
data = data.to.use)}
diff

The result is the following:

[[1]]
[1] 5

[[2]]
NULL

[[3]]
NULL

[[4]]
NULL

[[5]]
Linear mixed model fit by REML 
Formula: polity ~ 1 + (1 | country) 
   Data: data.to.use 
  AIC  BIC logLik deviance REMLdev
 1006 1015 -499.9     1002   999.9
Random effects:
 Groups   Name        Variance Std.Dev.
 country  (Intercept) 14.609   3.8222  
 Residual             17.839   4.2236  
Number of obs: 171, groups: country, 9

Fixed effects:
            Estimate Std. Error t value
(Intercept)    2.878      1.314    2.19

The results remain the same when I replace diff[[5]] by diff[[4]], diff[[3]] etc. Still, I am wondering whether this is actually the results for the combined dataset or for one single imputed data set. Any thoughts? Thanks!

893

asked May 15 '13 17:05

TiF

1 Answers

I modified the summary function for this object (fetched the source and opened up ./R/summary.R file). I added some curly braces to make the code flow and changed a getcoef to coef. This should work for this particular case, but I'm not sure if it's general. Function getcoef searches for slot coef3, and I have never seen this. Perhaps @BenBolker can throw an eye here? I can't guarantee this is what the result looks like, but the output looks legit to me. Perhaps you could contact the package authors to correct this in the future version.

summary(ML.model.0)

  Model: ls.mixed
  Number of multiply imputed data sets: 5 

Combined results:

Call:
zelig(formula = polity ~ 1 + tag(1 | country), model = "ls.mixed", 
    data = a.out$imputations)

Coefficients:
        Value Std. Error   t-stat    p-value
[1,] 2.902863   1.311427 2.213515 0.02686218

For combined results from datasets i to j, use summary(x, subset = i:j).
For separate results, use print(summary(x), subset = i:j).

Modified function:

summary.MI <- function (object, subset = NULL, ...) {
  if (length(object) == 0) {
    stop('Invalid input for "subset"')
  } else {
    if (length(object) == 1) {
      return(summary(object[[1]]))
    }
  }

  # Roman: This function isn't fecthing coefficients robustly. Something goes wrong. Contact package author. 
  getcoef <- function(obj) {
    # S4
    if (!isS4(obj)) {
      coef(obj)
    } else {
      if ("coef3" %in% slotNames(obj)) {
        obj@coef3
      } else {
        obj@coef
      }
    }
  }

    #
    res <- list()

    # Get indices
    subset <- if (is.null(subset)) {
      1:length(object)
    } else {
      c(subset)
    }

    # Compute the summary of all objects
    for (k in subset) {
      res[[k]] <- summary(object[[k]])
    }


    # Answer
    ans <- list(
      zelig = object[[1]]$name,
      call = object[[1]]$result@call,
      all = res
    )

    #
    coef1 <- se1 <- NULL

    #
    for (k in subset) {
#       tmp <-  getcoef(res[[k]]) # Roman: I changed this to coef, not 100% sure if the output is the same
      tmp <- coef(res[[k]])
      coef1 <- cbind(coef1, tmp[, 1])
      se1 <- cbind(se1, tmp[, 2])
    }

    rows <- nrow(coef1)
    Q <- apply(coef1, 1, mean)
    U <- apply(se1^2, 1, mean)
    B <- apply((coef1-Q)^2, 1, sum)/(length(subset)-1)
    var <- U+(1+1/length(subset))*B
    nu <- (length(subset)-1)*(1+U/((1+1/length(subset))*B))^2

    coef.table <- matrix(NA, nrow = rows, ncol = 4)
    dimnames(coef.table) <- list(rownames(coef1),
                                 c("Value", "Std. Error", "t-stat", "p-value"))
    coef.table[,1] <- Q
    coef.table[,2] <- sqrt(var)
    coef.table[,3] <- Q/sqrt(var)
    coef.table[,4] <- pt(abs(Q/sqrt(var)), df=nu, lower.tail=F)*2
    ans$coefficients <- coef.table
    ans$cov.scaled <- ans$cov.unscaled <- NULL

    for (i in 1:length(ans)) {
      if (is.numeric(ans[[i]]) && !names(ans)[i] %in% c("coefficients")) {
        tmp <- NULL
        for (j in subset) {
          r <- res[[j]]
          tmp <- cbind(tmp, r[[pmatch(names(ans)[i], names(res[[j]]))]])
        }
        ans[[i]] <- apply(tmp, 1, mean)
      }
    }

    class(ans) <- "summaryMI"
    ans
  }

163

answered Oct 15 '22 10:10

Roman Luštrik

Related questions
                            
                                How to avoid vector search in data.table
                            
                                inset \footnote{} into header with xtable and tabular.environment
                            
                                R does ggplot2 have interactivity option?
                            
                                Determining derivatives from GAM smooth object
                            
                                How to put square brackets and a subscript next to each other in R expression?
                            
                                Reshaping wide dataset in interval format
                            
                                Find all possible ways to split a list of elements into a a given number of group of the same size
                            
                                ggplot2/colorbrewer qualitative pallette with 125 categories
                            
                                Producing RNG vectors in R that have pre-defined sum of pdf or sum of cdf
                            
                                combining column number having same values
                            
                                Wrapping plots in another html container within an Rmd file
                            
                                How to set up a shortcut for different versions of R in RStudio?
                            
                                How to create design matrix in r
                            
                                Simpler way to reconstitute a melted data frame back to the original
                            
                                Interpolating data in R
                            
                                Formal argument "type" matched by multiple actual arguments
                            
                                Smallest size of GGplot2 geom_text()
                            
                                tableGrob: set the height and width of a grid.table
                            
                                How can I ensure that a partition has representative observations from each level of a factor?
                            
                                R Basket Analysis using arules package with unique order number but duplicate order combinations

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Multi-level regression model on multiply imputed data set in R (Amelia, zelig, lme4)

Tags:

r

missing-data

regression

multi-level

r-zelig

TiF

People also ask

1 Answers

Roman Luštrik

Recent Activity

Donate For Us