Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Maximizing a likelihood function that contains pbivnorm

Tags:

r

maximization

I have a likelihood function that contains a bivariate normal CDF. I keep getting values close to one for the correlation, even when the true value is zero.

The R package sampleSelection maximizes a likelihood function that contains a bivaraite normal CDF (as in Van de Ven and Van Praag (1981)). I tried looking at the source code for the package, but couldn't find how they write the likelihood. For reference, Van de Ven and Van Praag's paper:

The Demand for Deductibles in Private Health Insurance: A Probit Model with Sample Selection.

The likelihood function is Equation (19), where H denotes the standard normal CDF and H_2 denotes the bivariate normal CDF.

My question:

  1. Can someone tell me how the likelihood function is written in the sampleSelection package? or

  2. Can someone tell me why I'm getting values of close to one for the correlation in the code below?

Here's the code that's keeping me up at night:

########################################################
#
#  Trying to code Van de Ven and Van Praag (1981)
#
#
########################################################
library(MASS)
library(pbivnorm)
library(mnormt)
library(maxLik)
library(sampleSelection)
set.seed(1)

# Sample size
full_sample <- 1000

# Parameters
rho      <- .1
beta0    <- 0
beta1    <- 1
gamma0   <- .2
gamma1   <- .5
gamma2   <- .5
varcovar <- matrix(c(1,rho,rho,1), nrow = 2, ncol = 2) 
# Vectors for storing values
   y <- rep(0,full_sample)
   s <- rep(0,full_sample)
outcome  <- rep(0,full_sample)
select   <- rep(0,full_sample)

#######################
# Simulate data
#######################
 x <- rnorm(full_sample)
 z <- rnorm(full_sample)

 errors <- mvrnorm(full_sample, rep(0,2), varcovar)
# note: 1st element for selection eq; 2nd outcome
 s <- gamma0 + gamma1*x + gamma2*z + errors[,1]
 y <- beta0 + beta1*x + errors[,2]

 for(i in 1:full_sample){
    if(s[i]>=0){
       select[i] <- 1
       if(y[i]>=0){
         outcome[i] <- 1
       }else{
         outcome[i] <- 0
        }
    }else{
       outcome[i] <- NA
       select[i] <- 0
    }

}
#######################################
# Writing the log likelihood
##
# Note: vega1= beta0,
#       vega2= beta1,
#       vega3= gamma0,
#       vega4= gamma1,
#       vega5= gamma2,
#       vega6= rho
#######################################
first.lf <- function(vega) {

# Transforming this parameter becuase
#   correlation is bounded between -1 aad 1
  corr <- tanh(vega[6])

# Set up vectors for writing log likelihood
   y0 <- 1-outcome
   for(i in 1:full_sample) {    
     if(is.na(y0[i])){ y0[i]<- 0}
     if(is.na(outcome[i])){ outcome[i]<- 0}
   }
   yt0 <- t(y0)
   yt1 <- t(outcome)
   missing <- 1 - select
   ytmiss <- t(missing)

# Terms in the selection and outcome equations
   A <- vega[3]+vega[4]*x+vega[5]*z
   B <- vega[1]+vega[2]*x
   term1 <- pbivnorm(A,B,corr)
   term0 <- pbivnorm(A,-B,corr)
   term_miss <- pnorm(-A)
   log_term1 <- log(term1)
   log_term0 <- log(term0)
   log_term_miss <- log(term_miss)
# The log likelihood
    logl <- sum( yt1%*%log_term1 + yt0%*%log_term0 + ytmiss%*%log_term_miss)
return(logl)
}

startv <- c(beta0,beta1,gamma0,gamma1,gamma2,rho)
# Maxmimizing my likelihood gives
maxLik(logLik = first.lf, start = startv, method="BFGS")
# tanh(7.28604) = 0.9999991, far from the true value of .1

# Using sampleSelection package for comparison
outcomeF<-factor(outcome)
selectEq <- select ~ x + z
outcomeEq <- outcomeF ~ x
selection( selectEq, outcomeEq)
# Notice the value of -0.2162 for rho compared to my 0.9999991
like image 571
StupidQuestionAsker Avatar asked Mar 18 '26 16:03

StupidQuestionAsker


1 Answers

It happens that there is a typo in the paper in equation (19). The terms from i = N_1 + 1 to N should have -rho rather than rho. Hence, using

term0 <- pbivnorm(A,-B,-corr)

gives

maxLik(logLik = first.lf, start = startv, method="BFGS")
# Maximum Likelihood estimation
# BFGS maximization, 40 iterations
# Return code 0: successful convergence 
# Log-Likelihood: -832.5119 (6 free parameter(s))
# Estimate(s): 0.3723783 0.9307454 0.1349979 0.4693686 0.4572421 -0.219618 

as needed.

like image 129
Julius Vainora Avatar answered Mar 20 '26 05:03

Julius Vainora



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!