
Using standard deviations in GenMatch to encourage more pairs

I am following the example from the Matching package, in particular the GenMatch example. This continues on from a previous question.

Link to R package here

Following the example in the GenMatch documentation:

library(Matching)
data(lalonde)
attach(lalonde)

X = cbind(age, educ, black, hisp, married, nodegr, u74, u75, re75, re74)

BalanceMat <- cbind(age, educ, black, hisp, married, nodegr, u74, u75, re75, re74,
                    I(re74*re75))

genout <- GenMatch(Tr=treat, X=X, BalanceMatrix=BalanceMat, estimand="ATE", M=1,
                   pop.size=16, max.generations=10, wait.generations=1)

genout$matches
genout$ecaliper

Y=re78/1000

mout <- Match(Y=Y, Tr=treat, X=X, Weight.matrix=genout)
summary(mout)

We see that 185 treated observations are paired with 270 non-treated observations.

We can generate a table with the treated cases and their ages on the left and the matched control cases and their ages on the right by:

pairs <- data.frame(mout$index.treated, lalonde$age[mout$index.treated], mout$index.control, lalonde$age[mout$index.control])

The documentation for the Weight.matrix generated by GenMatch is very cryptic and doesn't explain what its values represent. I have an open question about that here. Now let's say we want to relax the matching so that pairing on the age criterion is more flexible.

We see that sd(lalonde$age) gives an SD of about 7 years for our data.

I want the Weight.matrix to account for this. I want to use a limit of 1 SD for the age variable and thus return more pairs than the original 185-270.

My guess is to run GenMatch a second time and then continue with my code. So I use:

genout <- GenMatch(Tr=treat, X=X, BalanceMatrix=BalanceMat, estimand="ATE",
                   pop.size=1000, max.generations=10, wait.generations=1,
                   caliper=c(2,1,1,1,1,1,1,1,1,1))

But this does not significantly increase the number of pairs returned.

Any hints as to where I am going wrong?

lukeg asked Jun 04 '15 14:06



1 Answer

As Nick Kennedy describes:

summary(as.logical(lalonde$treat))
   Mode   FALSE    TRUE    NA's 
logical     260     185       0 

GenMatch will only match M times for each treated case. It can potentially drop treated cases, and usually drops control cases since many don't match, but it can't generate new treated cases out of thin air: that is what multiple imputation is for ;-)
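To make the "can drop, cannot add" point concrete, here is a minimal sketch. It assumes the ATT estimand and Match's default weighting instead of GenMatch weights (to keep it quick), and caliper = 1 is only an illustrative value. The Matching documentation describes caliper as being in standardized units, with observations outside the caliper dropped, so a caliper can only shrink the matched set relative to running without one.

```r
library(Matching)
data(lalonde)

# Same covariates as in the question, built without attach().
X <- with(lalonde, cbind(age, educ, black, hisp, married, nodegr, u74, u75, re75, re74))
Y <- lalonde$re78 / 1000
n_treated <- sum(lalonde$treat == 1)

# No caliper: every treated case keeps a match.
m_free <- Match(Y = Y, Tr = lalonde$treat, X = X, estimand = "ATT", M = 1)

# Caliper of 1 SD on every covariate (illustrative value, an assumption).
m_cal <- Match(Y = Y, Tr = lalonde$treat, X = X, estimand = "ATT", M = 1, caliper = 1)

n_treated                             # treated cases in the data
length(unique(m_free$index.treated))  # distinct treated cases matched, no caliper
length(unique(m_cal$index.treated))   # distinct treated cases matched, with caliper
m_cal$ndrops                          # weighted observations dropped by the caliper
```

Comparing the last three numbers shows whether the caliper removed treated cases; it never adds any.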

If you mean generating more matches per treated case, this is achieved with the M argument. Caution is needed, though, especially when the number of controls is so close to the number of treated cases, as in the lalonde data: the best match has already been found, and adding further matches is unlikely to improve matters and often worsens them. Using M > 1 works best when the number of controls >> number of treated.
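As a rough illustration of the M argument, the sketch below compares M = 1 with M = 2. It assumes the ATT estimand and Match's default weighting rather than a GenMatch Weight.matrix, so the counts will not necessarily equal those in the question.

```r
library(Matching)
data(lalonde)

# Same covariates as in the question, built without attach().
X <- with(lalonde, cbind(age, educ, black, hisp, married, nodegr, u74, u75, re75, re74))
Y <- lalonde$re78 / 1000

# One match per treated case versus two matches per treated case.
m1 <- Match(Y = Y, Tr = lalonde$treat, X = X, estimand = "ATT", M = 1)
m2 <- Match(Y = Y, Tr = lalonde$treat, X = X, estimand = "ATT", M = 2)

# The set of treated cases is the same; only the number of matched rows grows.
length(unique(m1$index.treated))
length(unique(m2$index.treated))
length(m1$index.treated)
length(m2$index.treated)

# Check balance before trusting the extra matches (age only, as an example).
MatchBalance(treat ~ age, data = lalonde, match.out = m2, nboots = 0)
```

If balance is worse with M = 2 than with M = 1, the additional matches are not helping.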

If you like, you can reconstruct each 'pair' of matches from the output data when M > 1. This gives more rows than the 185 cases in the treatment group, but of course the treated cases are duplicated.
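The reconstruction described above can be sketched as follows, again assuming ATT and Match's default weighting; the column names in matched_pairs are made up for this example.

```r
library(Matching)
data(lalonde)

X <- with(lalonde, cbind(age, educ, black, hisp, married, nodegr, u74, u75, re75, re74))
Y <- lalonde$re78 / 1000

mout2 <- Match(Y = Y, Tr = lalonde$treat, X = X, estimand = "ATT", M = 2)

# One row per (treated, control) pairing; treated indices repeat when M > 1.
matched_pairs <- data.frame(
  treated_index = mout2$index.treated,
  treated_age   = lalonde$age[mout2$index.treated],
  control_index = mout2$index.control,
  control_age   = lalonde$age[mout2$index.control],
  weight        = mout2$weights
)

head(matched_pairs)
nrow(matched_pairs)                          # rows, including repeated treated cases
length(unique(matched_pairs$treated_index))  # distinct treated cases
```

The weight column matters when ties are present, because tied matches share the weight for a treated case.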

Jack Wasey answered Oct 13 '22 23:10
