I am running coarsened exact matching (CEM) via the MatchIt package as a pre-processing step and want to use the matched data in further analyses. As a test, I also ran CEM using the cem package directly and noticed that its imbalance measure differed from the one I computed on the MatchIt output. For example, using the LaLonde dataset:
library(MatchIt)
library(cem)
data(LL)
re74cut <- seq(0, 40000, 5000)
re75cut <- seq(0, max(LL$re75)+1000, by=1000)
agecut <- c(20.5, 25.5, 30.5,35.5,40.5)
my.cutpoints <- list(re75=re75cut, re74=re74cut, age=agecut)
matchit.match <- matchit(treated ~ age + education + black + married + nodegree +
                           re74 + re75 + hispanic + u74 + u75,
                         data = LL,
                         method = "cem",
                         cutpoints = my.cutpoints)
matchit.data <- match.data(matchit.match)
matchit.imb <- imbalance(group = matchit.data$treated,
                         data = matchit.data,
                         drop = c("treated", "re78", "distance",
                                  "weights", "subclass"))
cem.match <- cem(treatment = "treated",
                 data = LL, drop = "re78",
                 cutpoints = my.cutpoints,
                 eval.imbalance = TRUE)
matchit.imb
cem.match$imbalance
Does anybody know what is going on here? Thank you for any help.
There are two reasons. First, you must supply the weights from the matchit object to imbalance(). If you include these, the (diff) statistics will be correct, but the L1 statistic will still be wrong.
Second, by using matchit.data instead of LL in the call to imbalance(), the breaks for the L1 statistic are computed from the matched data only rather than from the full dataset, which yields a different value of the L1 statistic. To correct this, supply the original (unmatched) dataset in the call to imbalance() and use the matching weights to convey which units were matched. So, your final call to imbalance() should look like the following:
imbalance(LL$treated,
          data = LL,
          drop = c("treated", "re78"),
          weights = matchit.match$weights)
That will produce the same results as cem.match$imbalance.
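To see why the data used to build the breaks matters, here is a toy sketch in base R. It is not the cem package's implementation: it is univariate and unweighted, and the simulated variable, the `keep` subset standing in for "matched" units, and the helper `l1()` are all hypothetical names made up for illustration. It only shows that the same observations can give a different L1-type distance when the bins are derived from a subset rather than from the full sample.

```r
# Toy illustration only: a univariate, unweighted L1-type distance.
# All names here (x_full, keep, l1) are hypothetical, not part of cem or MatchIt.
set.seed(1)
x_full  <- c(rnorm(200, 0, 1), rnorm(200, 0.5, 1.5))
treated <- rep(c(1, 0), each = 200)
keep    <- abs(x_full) < 1   # stand-in for the units retained by matching

# Half the sum of absolute differences between the two groups'
# relative frequencies over the bins defined by `breaks`.
l1 <- function(x, g, breaks) {
  f1 <- table(cut(x[g == 1], breaks, include.lowest = TRUE)) / sum(g == 1)
  f0 <- table(cut(x[g == 0], breaks, include.lowest = TRUE)) / sum(g == 0)
  sum(abs(f1 - f0)) / 2
}

# Five equal-width bins spanning the full sample vs. the retained subset
breaks_full <- seq(min(x_full), max(x_full), length.out = 6)
breaks_sub  <- seq(min(x_full[keep]), max(x_full[keep]), length.out = 6)

# Same observations, different bins
l1(x_full[keep], treated[keep], breaks_full)
l1(x_full[keep], treated[keep], breaks_sub)
```

The two calls evaluate the same retained observations; only the bin edges change. Passing LL together with the weights to imbalance(), as in the answer above, avoids this by keeping the breaks tied to the full dataset.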