I would like to estimate a Heckman selection model manually in R. My problem is that the standard errors are biased. Is there a way to correct them manually as well?
Below is my (sample) code using the sampleSelection package (correct SEs), followed by the manual code (correct estimates, wrong SEs):
require(sampleSelection)
data( Mroz87 )
Mroz87$kids <- ( Mroz87$kids5 + Mroz87$kids618 > 0 )
heckman <- selection(selection = lfp ~ age + I(age^2) + faminc + kids + educ, outcome = wage ~ exper + I(exper^2) + educ + city,
data = Mroz87, method = "2step")
summary(heckman)
seleqn1 <- glm(lfp ~ age + I(age^2) + faminc + kids + educ, family=binomial(link="probit"), data=Mroz87)
summary(seleqn1)
# Calculate inverse Mills ratio by hand ##
Mroz87$IMR <- dnorm(seleqn1$linear.predictors)/pnorm(seleqn1$linear.predictors)
# Outcome equation correcting for selection ## ==> correct estimates, wrong SEs
outeqn1 <- lm(wage ~ exper + I(exper^2) + educ + city + IMR, data=Mroz87, subset=(lfp==1))
summary(outeqn1)
Heckman's (1974, 1978, 1979) sample selection model was developed using an econometric framework for handling limited dependent variables. It was designed to address the problem of estimating the average wage of women using data collected from a population of women in which housewives were excluded by self-selection.
From "Sample Selection Models in R: Package sampleSelection": a random experiment is the situation where the participants do not have control over their status but the researcher does. Randomisation is often the best possible method, as it is easy to analyse and understand.
The Inverse Mills Ratio (IMR) is defined as the ratio of the standard normal density, ϕ, to the standard normal cumulative distribution function, Φ: IMR(x) = ϕ(x) / Φ(x), x ∈ ℝ.
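The definition translates directly into R. The sketch below is only an illustration: the helper name imr is made up here, and working on the log scale is my own choice (it keeps the ratio from turning into 0/0 for very negative x), not something the question or the package requires.

```r
# Hypothetical helper (name is an assumption): inverse Mills ratio phi(x) / Phi(x).
# Computed as exp(log(phi) - log(Phi)) so it stays finite far in the left tail.
imr <- function(x) {
  exp(dnorm(x, log = TRUE) - pnorm(x, log.p = TRUE))
}

# For moderate x this agrees with the direct ratio used in the question
x <- c(-2, 0, 1.5)
all.equal(imr(x), dnorm(x) / pnorm(x))
```

For the linear predictors of a fitted probit the two forms should agree, so the direct ratio in the question is fine for this data set.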
myprobit <- probit(lfp ~ age + I(age^2) + faminc + kids + educ - 1, x = TRUE,
iterlim = 30, data=Mroz87)
imrData <- invMillsRatio(myprobit) # same as yours in this particular case
Mroz87$IMR1 <- imrData$IMR1
outeqn1 <- lm(wage ~ -1 + exper + I(exper^2) + educ + city + IMR1,
data=Mroz87, subset=(lfp==1))
The main difference is that your models include an intercept, while the ones above are fitted without one (note the -1 in both formulas).
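As for the standard errors themselves: the plain lm() output is off because the second-step error is heteroskedastic and the IMR is itself an estimated regressor. A rough sketch of the textbook two-step covariance correction (in the form given in Greene's Econometric Analysis, as far as I recall it) is below. All object names (probit_fit, ols, se_corrected, and so on) are my own, and I have not checked that the result matches summary(heckman) digit for digit, so treat it as a starting point and compare the two.

```r
library(sampleSelection)   # only needed for the Mroz87 data
data(Mroz87)
Mroz87$kids <- (Mroz87$kids5 + Mroz87$kids618 > 0)

# Step 1: probit selection equation
probit_fit <- glm(lfp ~ age + I(age^2) + faminc + kids + educ,
                  family = binomial(link = "probit"), data = Mroz87)

sel    <- Mroz87$lfp == 1                          # selected observations
z      <- predict(probit_fit, type = "link")[sel]  # linear predictor
lambda <- dnorm(z) / pnorm(z)                      # inverse Mills ratio
delta  <- lambda * (lambda + z)
W      <- model.matrix(probit_fit)[sel, , drop = FALSE]

# Step 2: OLS on the selected sample with the IMR as extra regressor
d     <- Mroz87[sel, ]
d$IMR <- lambda
ols   <- lm(wage ~ exper + I(exper^2) + educ + city + IMR, data = d)

X   <- model.matrix(ols)
e   <- residuals(ols)
b_l <- unname(coef(ols)["IMR"])

# Consistent estimates of sigma^2 and rho^2
sigma2 <- sum(e^2) / length(e) + mean(delta) * b_l^2
rho2   <- b_l^2 / sigma2

# Corrected covariance:
# sigma2 * (X'X)^-1 [ X'(I - rho2 * Delta) X + Q ] (X'X)^-1
XtX_inv <- solve(crossprod(X))
XDW     <- crossprod(X, delta * W)                 # X' Delta W
Q       <- rho2 * XDW %*% vcov(probit_fit) %*% t(XDW)
middle  <- crossprod(X, (1 - rho2 * delta) * X) + Q
V       <- sigma2 * XtX_inv %*% middle %*% XtX_inv

se_corrected <- sqrt(diag(V))
cbind(estimate     = coef(ols),
      se_naive     = sqrt(diag(vcov(ols))),
      se_corrected = se_corrected)
```

The Q term carries the uncertainty from the first-step probit into the outcome equation, while the (1 - rho2 * delta) weights account for the heteroskedasticity; dropping both gives back the usual OLS formula up to the variance estimate.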