Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Maximum likelihood in R

Tags:

r

statistics

I am new both to R and statistics. I am playing with maximum likelihood estimation, and I am getting some incorrect results. I want to model x with a simple linear function:

x<-apply(matrix(seq(1,10,1), nrow=1), 1, function(x) 10*x+runif(10,-3,3))
LL<-function(a,b){
    R=apply(x,1,function(y) a*y+b)
    -sum(log(R))
    }
mle(LL, start=list(a=10, b=0))

I am getting the following result:

Coefficients:
    a         b 
43571.957  1338.345 

instead of a~10, b~0.

I modified the code according to the suggestions of Spacedman:

set.seed(99)
x<-apply(matrix(seq(1,10,1), nrow=1), 1, function(x) 10*x+runif(10,-3,3))
LL<-function(a,b){
R = x[,1] - a*(1:10) + b
-sum(R^2)
}
library(stats4)
mle(LL, start=list(a=11, b=0.3))

Error in solve.default(oout$hessian) : 
Lapack routine dgesv: system is exactly singular: U[1,1] = 0

I do not know how to get rid of this error. Changing the sees and generating the x values again does not help.

like image 407
robert Avatar asked Jan 09 '23 15:01

robert


1 Answers

There are a couple of things to notice here. To clarify we start by changing the distribution of the error-term from a uniform distribution runif(x, -3, 3) to the std. normal distribution: rnorm(x). We can now easily simulate your data, then set up your (minus) loglikelihood and maximize (minizime) by:

a <- 10 
b <- 0
set.seed(99)
x <- apply(matrix(seq(1, 10, 1), nrow=1), 1, function(x) b + a * x + rnorm(10))
minuslogL <- function(a, b) -sum(dnorm(x[, 1] - (b + a * 1:10), log = TRUE))
library(stats4)
mle(minuslogL, start = list(a = 11, b = 0.3))

Call:
mle(minuslogl = minuslogL, start = list(a = 11, b = 0.3))

Coefficients:
        a         b 
9.8732793 0.5922192 

Notice that this works well, since the likelihood is smooth and mle() uses "BFGS" for the optimization, eg. a quasi-Newton, gradient approach. Lets try the same with uniform errors:

set.seed(99)
x <- apply(matrix(seq(1, 10, 1), nrow=1), 1, function(x) b + a * x + runif(10, -3, 3))
minuslogL2 <- function(a,b) -sum(dunif(x[, 1] -(a * 1:10 + b), -3, 3, log = TRUE))
mle(minuslogL2, start = list(a = 11, b = 0.3))

Error in optim(start, f, method = method, hessian = TRUE, ...) : 
  initial value in 'vmmin' is not finite

This fails! Why? Since the uniform-errors restrict the parameter space, you will not get a smooth likelihood. If you move your parameters a,b too far away from the true values, you will get Inf. If you move close enough, you will get the same likelihood (eg. many possible min. values):

> minuslogL2(11, 0.3)
[1] Inf
> minuslogL2(10, 0)
[1] 17.91759
> minuslogL2(10.02, 0.06)
[1] 17.91759

Maximizing this likelihood compares to finding the set: {a,b}: -logL(a, b) == -logL(10, 0), which can be found by a plain search algorithm.

like image 164
J.R. Avatar answered Jan 12 '23 09:01

J.R.