
Include ID variable in imputed data frame

I'm using library(mice) to impute missing data. I want a way to tell mice that the ID variable should be included in the imputed data sets but not used for the imputations.

For instance

#making a silly data frame with missing data
library(tidyverse)
library(magrittr)
library(mice)

d1 <- data.frame(
  id = str_c(
    letters[1:20] %>% 
      rep(each = 5),
    1:5 %>% 
      rep(times  = 20)
    ),
  v1 = runif(100),
  v2 = runif(100),
  v3 = runif(100)
  )

# set 5 randomly chosen values to NA in every column except id
d1[, -1] %<>%
  map(
    function(i){
      i[sample(100, 5)] <- NA
      i
      }
    )

Running mice without the id column returns this mids object:

m1 <- d1 %>% 
  select(-id) %>% 
  mice

How can I include d1$id as a variable in each of the imputed data frames?

tomw asked Jun 28 '26 06:06



1 Answer

There are two ways. First, simply append id to the imputed datasets:

d2 <- mice::complete(m1, 'long', include = TRUE) # imputed datasets in long format (including the original)
d3 <- cbind(id = d1$id, d2) # the stacked datasets keep the original row order, so id is recycled once per dataset
m2 <- as.mids(d3) # and transform back to mids object

This ensures that id plays no role in the imputation process, but it is a bit sloppy and error-prone, because it relies on the stacked rows keeping the original order. The other way is to keep id in the data and remove it from the predictor matrix.

The 2011 manual by Van Buuren & Groothuis-Oudshoorn says: "The user can specify a custom predictorMatrix, thereby effectively regulating the number of predictors per variable. For example, suppose that bmi is considered irrelevant as a predictor. Setting all entries within the bmi column to zero effectively removes it from the predictor set ... will not use bmi as a predictor, but still impute it."

To do this

ini <- mice(d1, maxit = 0) # dry run without iterations to get the predictor matrix

pred1 <- ini$predictorMatrix # this is your predictor matrix
pred1[, 'id'] <- 0 # set all id column values to zero to exclude it as a predictor

m1 <- mice(d1, predictorMatrix = pred1) # use the new matrix in mice
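A quick way to check the predictor-matrix approach is sketched below on toy data shaped like the question's. The seed values and the `imp1` name are illustrative assumptions, not part of the original code. If the approach works as described, `id` should come back unchanged in a completed dataset and its column in the predictor matrix should be all zeros.

```r
library(mice)

set.seed(1) # assumption: fixed seed so the toy data is reproducible

# Toy data mirroring the question: an id column plus three numeric variables
d1 <- data.frame(
  id = paste0(rep(letters[1:20], each = 5), rep(1:5, times = 20)),
  v1 = runif(100),
  v2 = runif(100),
  v3 = runif(100)
)
# Set 5 random values to NA in each numeric column
for (v in c("v1", "v2", "v3")) d1[sample(100, 5), v] <- NA

ini <- mice(d1, maxit = 0)    # dry run to obtain the default predictor matrix
pred1 <- ini$predictorMatrix
pred1[, "id"] <- 0            # id is never used as a predictor

m1 <- mice(d1, predictorMatrix = pred1, printFlag = FALSE, seed = 123)

# mice::complete is called with its namespace because tidyr also exports complete()
imp1 <- mice::complete(m1, 1) # first completed dataset, id included

head(imp1)
```

Because `id` stays in the data passed to `mice()`, every completed dataset carries it along without any manual `cbind()`.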

You can also prevent mice from imputing the variable, but as it contains no missing values this is not necessary (mice will skip it automatically).
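For completeness, here is a minimal sketch of switching off imputation for a variable explicitly, by setting its entry in the method vector to an empty string. The names `meth` and `m2` and the toy data are assumptions for illustration. In this example the entry is already empty, because `id` has no missing values.

```r
library(mice)

set.seed(2) # assumption: fixed seed so the toy data is reproducible

# Toy data shaped like the question's
d1 <- data.frame(
  id = paste0(rep(letters[1:20], each = 5), rep(1:5, times = 20)),
  v1 = runif(100),
  v2 = runif(100),
  v3 = runif(100)
)
for (v in c("v1", "v2", "v3")) d1[sample(100, 5), v] <- NA

ini <- mice(d1, maxit = 0)   # dry run to obtain the default settings

pred1 <- ini$predictorMatrix
pred1[, "id"] <- 0           # do not use id as a predictor

meth <- ini$method           # one imputation method per variable
meth["id"] <- ""             # empty string: do not impute this variable

m2 <- mice(d1, method = meth, predictorMatrix = pred1,
           printFlag = FALSE, seed = 123)

m2$method                    # shows which variables are imputed and how
```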

Niek answered Jun 30 '26 23:06



