Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Creating function arguments from a named list (with an application to stats4::mle)

Tags:

r

I should start by saying what I'm trying to do: I want to use the mle function without having to re-write my log likelihood function each time I want to try a different model specification. Because mle is expecting a named list of starting values, you apparently cannot just write the log-likelihood function as taking a vector of parameters. A simple example:

Suppose I want to fit a linear regression model via maximum likelihood and at first, I'm ignoring one of my predictors:

n <- 100
df <- data.frame(x1 = runif(n), x2 = runif(n), y = runif(n))
Y <- df$y
X <- model.matrix(lm(y ~ x1, data = df))

# define log-likelihood function 
ll <- function(beta0, beta1, sigma){
  beta = matrix(NA, nrow=2, ncol=1)
  beta[,1] = c(beta0, beta1)
  -sum(log(dnorm(Y - X %*% beta, 0, sigma)))
}
library(stats4)
mle(ll, start = list(beta0=.1, beta1=.2, sigma=1)

Now, if I want to fit a different model, say:

m <- lm(y ~ x1 + x2, data = df)

I cannot re-use my log-likelihood function--I'd have to re-write it to have the beta3 parameter. What I'd like to do is something like:

ll.flex <- function(theta){
 # theta is a vector that I can use directly 
  ...
}

if I could then somehow adjust the start argument in mle to account for my now vector-input log-likelihood function, or barring that, have a function that constructs the log-likelihood function at run-time, say by constructing the named list of arguments and then using it to define the function e.g., something like this:

X <- model.matrix(lm(y ~ x1 + x2, data = df))
arguments <- rep(NA, dim(X)[2])
names(arguments) <- colnames(X)
ll.magic <- function(bring.this.to.life.as.function.arguments(arguments)){...} 

Update:

I ended up writing a helper function that can add an arbitrary number of named arguments x1, x2, x3... to a passed function f.

add.arguments <- function(f,n){
  # adds n arguments to a function f; returns that new function 
  t = paste("arg <- alist(",
  paste(sapply(1:n, function(i) paste("x",i, "=",sep="")), collapse=","),
  ")", sep="")
  formals(f) <- eval(parse(text=t))
  f
 }

It's ugly, but it got the job done, letting me re-factor my log-likelihood function on the fly.

like image 867
John Horton Avatar asked Feb 03 '23 13:02

John Horton


2 Answers

You can use the mle2 function from the package bbmle which allows you to pass vectors as parameters. Here is some sample code.

# REDEFINE LOG LIKELIHOOD
ll2 = function(params){
  beta = matrix(NA, nrow = length(params) - 1, ncol = 1)
  beta[,1] = params[-length(params)] 
  sigma    = params[[length(params)]]
  minusll  = -sum(log(dnorm(Y - X %*% beta, 0, sigma)))
  return(minusll)
}

# REGRESS Y ON X1
X <- model.matrix(lm(y ~ x1, data = df))
mle2(ll2, start = c(beta0 = 0.1, beta1 = 0.2, sigma = 1), 
  vecpar = TRUE, parnames = c('beta0', 'beta1', 'sigma'))

# REGRESS Y ON X1 + X2

X <- model.matrix(lm(y ~ x1 + x2, data = df))
mle2(ll2, start = c(beta0 = 0.1, beta1 = 0.2, beta2 = 0.1, sigma = 1), 
      vecpar = TRUE, parnames = c('beta0', 'beta1', 'beta2', 'sigma'))

This gives you

Call:
mle2(minuslogl = ll2, start = c(beta0 = 0.1, beta1 = 0.2, beta2 = 0.1, 
    sigma = 1), vecpar = TRUE, parnames = c("beta0", "beta1", 
    "beta2", "sigma"))

Coefficients:
     beta0      beta1      beta2      sigma 
 0.5526946 -0.2374106  0.1277266  0.2861055
like image 172
Ramnath Avatar answered Feb 05 '23 08:02

Ramnath


It might be easier to use optim directly; that's what mle is using anyway.

ll2 <- function(par, X, Y){
  beta <- matrix(c(par[-1]), ncol=1)
  -sum(log(dnorm(Y - X %*% beta, 0, par[1])))
}
getp <- function(X, sigma=1, beta=0.1) {
  p <- c(sigma, rep(beta, ncol(X)))
  names(p) <- c("sigma", paste("beta", 0:(ncol(X)-1), sep=""))
  p
}

set.seed(5)
n <- 100
df <- data.frame(x1 = runif(n), x2 = runif(n), y = runif(n))
Y <- df$y
X1 <- model.matrix(y ~ x1, data = df) 
X2 <- model.matrix(y ~ x1 + x2, data = df)
optim(getp(X1), ll2, X=X1, Y=Y)$par
optim(getp(X2), ll2, X=X2, Y=Y)$par

With the output of

> optim(getp(X1), ll2, X=X1, Y=Y)$par
      sigma       beta0       beta1 
 0.30506139  0.47607747 -0.04478441 
> optim(getp(X2), ll2, X=X2, Y=Y)$par
      sigma       beta0       beta1       beta2 
 0.30114079  0.39452726 -0.06418481  0.17950760 
like image 42
Aaron left Stack Overflow Avatar answered Feb 05 '23 08:02

Aaron left Stack Overflow