I am trying to figure out how to plot the profile likelihood curve of a GLM parameter with 95% pLCI's on the same plot. The example I have been trying with is below. The plots I am getting are not the likelihood curves that I was expecting. The y-axis of the plots is tau and I would like that axis to be the likelihood so that I have a curve that maxes at the parameter estimate. I am not sure where I find those likelihood values? I may just be misinterpreting the theory behind this. Thanks for any help you can give.
Max
clotting <- data.frame(
u = c(5,10,15,20,30,40,60,80,100),
lot1 = c(118,58,42,35,27,25,21,19,18),
lot2 = c(69,35,26,21,18,16,13,12,12))
glm2<-glm(lot2 ~ log(u), data=clotting, family=Gamma)
prof<-profile(glm2)
plot(prof)
Regenerate your example:
clotting <- data.frame(
u = c(5,10,15,20,30,40,60,80,100),
lot1 = c(118,58,42,35,27,25,21,19,18),
lot2 = c(69,35,26,21,18,16,13,12,12))
glm2 <- glm(lot2 ~ log(u), data=clotting, family=Gamma)
The profile.glm
function actually lives in the MASS
package:
library(MASS)
prof<-profile(glm2)
In order to figure out what profile.glm
and plot.profile
are doing, see ?profile.glm
and ?plot.profile
. However, in order to dig into the profile
object it may also be useful to examine the code of MASS:::profile.glm
and MASS:::plot.profile
... basically, what these tell you is that profile
is returning the signed square root of the difference between the deviance and the minimum deviance, scaled by the dispersion parameter. The reason that this is done is so that the profile for a perfectly quadratic profile will appear as a straight line (it's much easier to detect deviations from a straight line than from a parabola by eye).
The other thing that may be useful to know is how the profile is stored. Basically, it's a list of data frames (one for each parameter profiled), except that the individual data frames are a little bit weird (containing one vector component and one matrix component).
> str(prof)
List of 2
$ (Intercept):'data.frame': 12 obs. of 3 variables:
..$ tau : num [1:12] -3.557 -2.836 -2.12 -1.409 -0.702 ...
..$ par.vals: num [1:12, 1:2] -0.0286 -0.0276 -0.0267 -0.0258 -0.0248 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : NULL
.. .. ..$ : chr [1:2] "(Intercept)" "log(u)"
..$ dev : num [1:12] 0.00622 0.00753 0.00883 0.01012 0.0114 ...
$ log(u) :'data.frame': 12 obs. of 2 variables:
..$ tau : num [1:12] -3.516 -2.811 -2.106 -1.403 -0.701 ...
..$ par.vals: num [1:12, 1:2] -0.0195 -0.0204 -0.0213 -0.0222 -0.023 ...
.. ..- attr(*, "dimnames")=List of 2
It also contains attributes summary
and original.fit
that you can use to recover the dispersion and minimum deviance:
disp <- attr(prof,"summary")$dispersion
mindev <- attr(prof,"original.fit")$deviance
Now reverse the transformation for parameter 1:
dev1 <- prof[[1]]$tau^2
dev2 <- dev1*disp+mindev
Plot:
plot(prof[[1]][,1],dev2,type="b")
(This is the plot of the deviance. You can multiply by 0.5 to get the negative log-likelihood, or -0.5 to get the log-likelihood ...)
edit: some more general functions to transform the profile into a useful format for lattice
/ggplot
plotting ...
tmpf <- function(x,n) {
data.frame(par=n,tau=x$tau,
deviance=x$tau^2*disp+mindev,
x$par.vals,check.names=FALSE)
}
pp <- do.call(rbind,mapply(tmpf,prof,names(prof),SIMPLIFY=FALSE))
library(reshape2)
pp2 <- melt(pp,id.var=1:3)
pp3 <- subset(pp2,par==variable,select=-variable)
Now plot it with lattice:
library(lattice)
xyplot(deviance~value|par,type="b",data=pp3,
scales=list(x=list(relation="free")))
Or with ggplot2:
library(ggplot2)
ggplot(pp3,aes(value,deviance))+geom_line()+geom_point()+
facet_wrap(~par,scale="free_x")
FYI, for fun, I took the above and whipped it together into a single function using purrr::imap_dfr
as I couldn't find a package that implements the above.
get_profile_glm <- function(aglm){
prof <- MASS:::profile.glm(aglm)
disp <- attr(prof,"summary")$dispersion
purrr::imap_dfr(prof, .f = ~data.frame(par = .y,
deviance=.x$z^2*disp+aglm$deviance,
values = as.data.frame(.x$par.vals)[[.y]],
stringsAsFactors = FALSE))
}
Works great!
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
print(d.AD <- data.frame(treatment, outcome, counts))
glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
ggplot(get_profile_glm(aglm), aes(x = values, y = deviance)) +
geom_point() +
geom_line() +
facet_wrap(~par, scale = "free_x")
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With