Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Oaxaca Decomposition in R [closed]

Tags:

I would like to make a Oaxaca Decomposition in R. It is used in e.g. labor economics to distinguish explained variance versus unexplained variance, I believe. I have not been able to find a suitable solution in R, and I am rather reluctant to create one myself (I would probably mess it up).

Anyway, the procedure is briefly explained here:

http://en.wikipedia.org/wiki/Ronald_Oaxaca

Stata is blessed with a rather good package for this, but Stata is not easily available to me.

www.stata.com/meeting/5german/SINNING_stata_presentation.pdf

Please note: I have also posted a message on R-help but it has gotten no reply. I hope it is okay to post on this list as well.

Thanks in advance, Rasmus

Edit: I have made the following function, which seems to yield wrong answers (urgh). I tried to follow the Stata link above but it did not work out as I hoped :)

oaxaca <- function (fsex,frace1,frace2) {
  ## First we make regresions
  data1 <- subset(l2,sex==fsex & race==frace1)
  data2 <- subset(l2,sex==fsex & race==frace2)

  mindata1 <- subset(cbind(grade,exp,I(exp^2)),sex==fsex & race==frace1)
  mindata2 <- subset(cbind(grade,exp,I(exp^2)),sex==fsex & race==frace2)

  reg1 <- lm(log(wage)~grade+exp+I(exp^2), data=data1)
  reg2 <- lm(log(wage)~grade+exp+I(exp^2), data=data2)

  ## DECOMPOSITION
  ################

  ## Variables
  gap <- mean(log(wage[race==frace1 & sex==fsex]))-mean(log(wage[race==frace2 & sex==fsex]))

  mean1 <- colMeans(mindata1)
  mean2 <- colMeans(mindata2)

  beta1 <- summary(reg1)$coefficients[,1]
  beta2 <- summary(reg2)$coefficients[,1]
  beta1incep <- summary(reg1)$coefficients[1,1]
  beta2incep <- summary(reg2)$coefficients[1,1]
  beta1coef <- summary(reg1)$coefficients[c(2,3,4),1]
  beta2coef <- summary(reg2)$coefficients[c(2,3,4),1]
  betastar <- .5*(beta1coef+beta2coef)
  betastar2 <- (beta1+beta2)/2

  expl <- sum((mean1-mean2)*beta1coef)
  uexpl <- sum(mean2*(beta2coef-beta1coef))

  pct=expl/gap
  pct2=uexpl/gap

  ## output
  out <- data.frame(Gap=gap,
         Explained=expl,
         Unexplained=uexpl,
         Pct=pct*100)

  return(out)
 }