 

R packages cem and MatchIt: Different imbalance measure

Tags: matching, r

I am running coarsened exact matching (CEM) via the MatchIt package as a pre-processing step and want to use the matched data in further analyses. As a test, I also ran CEM directly with the cem package and noticed that its imbalance measure differs from the one I compute on the MatchIt output. For example, using the LaLonde dataset:

library(MatchIt)
library(cem)
data(LL)

re74cut <- seq(0, 40000, 5000)
re75cut <- seq(0, max(LL$re75)+1000, by=1000)
agecut <- c(20.5, 25.5, 30.5,35.5,40.5)
my.cutpoints <- list(re75=re75cut, re74=re74cut, age=agecut)

matchit.match <- matchit(treated ~ age + education + black + married + nodegree + 
                           re74 + re75 + hispanic + u74 + u75,
                         data = LL,
                         method = "cem",
                         cutpoints = my.cutpoints)

matchit.data <- match.data(matchit.match)

matchit.imb <- imbalance(group=matchit.data$treated,
                         data=matchit.data,
                         drop=c("treated","re78","distance",
                                "weights","subclass"))

cem.match <- cem(treatment = "treated",
                 data = LL, drop = "re78",
                 cutpoints = my.cutpoints, 
                 eval.imbalance = TRUE)

matchit.imb
cem.match$imbalance

Does anybody know what is going on here? Thank you for any help.

Asked by maaas, Mar 17 '20 at 15:03


1 Answer

There are two reasons. First, you must supply the matching weights from the matchit object to imbalance(). With the weights included, the (diff) statistics will be correct, but the L1 statistic will still be wrong.

Second, because you pass matchit.data instead of LL to imbalance(), the breaks for the L1 statistic are computed from the matched data only rather than from the full dataset, which yields a different value of L1. To correct this, supply the original (unmatched) dataset to imbalance() and pass the matching weights to indicate which units were matched. Your final call to imbalance() should look like this:

imbalance(LL$treated, 
          data=LL, 
          drop=c("treated", "re78"), 
          weights=matchit.match$weights)

That will produce the same results as cem.match$imbalance.
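
To see why the choice of dataset matters for L1, here is a minimal base-R sketch. It is a toy, not the cem package's implementation: it is univariate, the helper l1_stat(), the data, and the weights are all made up for illustration, and the breaks are simply evenly spaced over the range of whatever data is passed in. It only shows that the same matched units can give a different L1 when the bins are derived from the matched subset rather than from the full data.

```r
# Toy univariate L1: half the summed absolute difference between the
# weighted bin proportions of the treated and control groups.
# (Hypothetical helper, NOT the cem package's code.)
l1_stat <- function(x, treated, breaks, weights = rep(1, length(x))) {
  bins <- cut(x, breaks = breaks, include.lowest = TRUE)
  f_t <- tapply(weights[treated == 1], bins[treated == 1], sum)
  f_c <- tapply(weights[treated == 0], bins[treated == 0], sum)
  f_t[is.na(f_t)] <- 0  # empty bins come back as NA
  f_c[is.na(f_c)] <- 0
  0.5 * sum(abs(f_t / sum(f_t) - f_c / sum(f_c)))
}

# Made-up data: the last unit is an unmatched control (weight 0).
x       <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 20)
treated <- c(1, 1, 1, 0, 1, 0, 0, 1, 0, 0)
w       <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 0)
matched <- w > 0

# Breaks from the full data vs. breaks from the matched subset only.
breaks_full    <- seq(min(x), max(x), length.out = 5)
breaks_matched <- seq(min(x[matched]), max(x[matched]), length.out = 5)

# Full data + weights (analogous to passing LL and the matching weights).
l1_full_breaks <- l1_stat(x, treated, breaks_full, weights = w)

# Matched subset only (analogous to passing matchit.data).
l1_matched_breaks <- l1_stat(x[matched], treated[matched], breaks_matched)

print(c(full = l1_full_breaks, matched = l1_matched_breaks))
```

The two values differ even though the same units contribute in both calls. Only the bin boundaries changed, which is the effect described in the second point above.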

Answered by Noah, Nov 17 '22 at 18:11