I am following the GenMatch example from the Matching package for R (link to R package here). This continues on from a previous question.
The example code is:
library(Matching)
data(lalonde)
attach(lalonde)
X = cbind(age, educ, black, hisp, married, nodegr, u74, u75, re75, re74)
BalanceMat <- cbind(age, educ, black, hisp, married, nodegr, u74, u75, re75, re74,
I(re74*re75))
genout <- GenMatch(Tr=treat, X=X, BalanceMatrix=BalanceMat, estimand="ATE", M=1,
pop.size=16, max.generations=10, wait.generations=1)
genout$matches
genout$ecaliper
Y=re78/1000
mout <- Match(Y=Y, Tr=treat, X=X, Weight.matrix=genout)
summary(mout)
We see that 185 treated observations are paired with 270 non-treated observations.
We can generate a table with the treated cases and their ages on the left and the control cases and their ages on the right with:
pairs <- data.frame(mout$index.treated, lalonde$age[mout$index.treated], mout$index.control, lalonde$age[mout$index.control])
Now, the literature about the Weight.matrix generated by GenMatch is very cryptic and doesn't explain what these values represent. I have an open question here. Now let's say we want to relax the matching so that pairing on the age criterion is more flexible.
We see that sd(lalonde$age) gives us an SD of about 7 years for our data.
So I want the Weight.matrix to account for this. I want to use a limit of 1 SD for the age variable and thus return more pairs than the original 185-270.
My guess is to run GenMatch a second time and then continue with my code. So I use:
genout <- GenMatch(Tr=treat, X=X, BalanceMatrix=BalanceMat, estimand="ATE",
pop.size=1000, max.generations=10, wait.generations=1,
caliper=c(2,1,1,1,1,1,1,1,1,1))
But this does not significantly increase the number of pairs returned.
Any hints or solutions as to where I am going wrong?
As Nick Kennedy describes:
summary(as.logical(lalonde$treat))
Mode FALSE TRUE NA's
logical 260 185 0
GenMatch will only match M times for each treated case. It can potentially drop treated cases, and usually drops control cases since many don't match, but it can't generate new treated cases out of thin air: that is what multiple imputation is for ;-)
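To make the point about dropping concrete: the Matching documentation describes caliper as being in standardized units, with matches outside the caliper dropped, so a caliper can remove pairs but cannot add any. The sketch below is only an illustration. It uses Match with its default weights rather than GenMatch, estimand="ATT", and a reduced two-covariate X, all of which are assumptions made for brevity and are not the question's setup.

```r
library(Matching)
data(lalonde)

# Translate the question's age caliper (2, in SD units) into years
age_sd <- sd(lalonde$age)            # about 7 years, as noted in the question
caliper_years <- 2 * age_sd          # approximate widest age gap a match may have
caliper_years

# Reduced covariate set (assumption, for brevity only)
X <- as.matrix(lalonde[, c("age", "educ")])
Y <- lalonde$re78 / 1000

# Same matching problem without and with a caliper
m_free <- Match(Y = Y, Tr = lalonde$treat, X = X, estimand = "ATT", M = 1)
m_cal  <- Match(Y = Y, Tr = lalonde$treat, X = X, estimand = "ATT", M = 1,
                caliper = c(2, 1))

# The caliper can only keep or drop treated cases, never add them
length(unique(m_free$index.treated))
length(unique(m_cal$index.treated))

# Enforced caliper on the original scale of each covariate
m_cal$ecaliper
```

If the second count is lower than the first, the difference is the set of treated cases that had no control within the caliper.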
If you mean generating more matches per treated case, this is achieved with the M argument, but caution is needed, especially when the number of controls is so close to the number of treated cases, as in the lalonde data: the best match has already been found, and adding additional matches is unlikely to improve matters and often worsens them. Extra matches work best when the number of controls >> the number of treated.
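A minimal sketch of what changing M does to the output size, assuming default Match weights (no GenMatch step) and estimand="ATT" so that only treated cases are matched. Both choices are simplifications for illustration and differ from the question's estimand="ATE" run.

```r
library(Matching)
data(lalonde)

# Same covariates as in the question, taken from the data frame instead of attach()
X <- as.matrix(lalonde[, c("age", "educ", "black", "hisp", "married",
                           "nodegr", "u74", "u75", "re75", "re74")])
Y <- lalonde$re78 / 1000

# One match per treated case versus two matches per treated case
m1 <- Match(Y = Y, Tr = lalonde$treat, X = X, estimand = "ATT", M = 1)
m2 <- Match(Y = Y, Tr = lalonde$treat, X = X, estimand = "ATT", M = 2)

# Number of matched rows returned in each run (ties can add rows)
length(m1$index.treated)
length(m2$index.treated)

# The number of distinct treated cases cannot exceed the 185 in the data
length(unique(m2$index.treated))
```

Comparing summary(m1) with summary(m2), and checking balance with MatchBalance, is one way to judge whether the extra matches help or hurt.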
You can reconstruct each 'pair' of matches when M > 1 from the output data, if that is what you would like, and this will give a number of rows greater than the 185 in the treatment group, but of course with duplicates.
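The reconstruction described above could look roughly like this. It reuses the index.treated, index.control and weights components that Match returns. The reduced covariate set, estimand="ATT" and the column names of the pairs data frame are assumptions made for the sketch.

```r
library(Matching)
data(lalonde)

# Reduced covariate set (assumption, for brevity only)
X <- as.matrix(lalonde[, c("age", "educ", "re74", "re75")])
m2 <- Match(Y = lalonde$re78 / 1000, Tr = lalonde$treat, X = X,
            estimand = "ATT", M = 2)

# One row per matched (treated, control) pair
pairs <- data.frame(
  treated     = m2$index.treated,
  treated_age = lalonde$age[m2$index.treated],
  control     = m2$index.control,
  control_age = lalonde$age[m2$index.control],
  weight      = m2$weights
)

head(pairs)
nrow(pairs)                      # more rows than treated cases
sum(duplicated(pairs$treated))   # treated cases that appear more than once
```

Because each treated case appears in several rows, the weight column should be carried along in any later summary of the pairs.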