The muhaz
package estimates the hazard function from right censored data using kernel smoothing methods. My question is, is there any way to obtain confidence intervals for the hazard function that muhaz
calculates?
options(scipen=999)
library(muhaz)
data(ovarian, package="survival")
attach(ovarian)
fit1 <- muhaz(futime, fustat)
plot(fit1, lwd=3, ylim=c(0,0.002))
In the above example the muhaz.object
fit
has some entries fit1$msemin
, fit1$var.min
, fit1$haz.est
however their length is half of the fit1$haz.est
.
Any ideas if it is possible to extract confidence intervals for the hazard function?
EDIT: I tried the following based with what @user20650 suggested
options(scipen=999)
library(muhaz)
data(ovarian, package="survival")
fit1 <- muhaz(ovarian$futime, ovarian$fustat,min.time=0, max.time=744)
h.df<-data.frame(est=fit1$est.grid, h.orig=fit1$haz.est)
for (i in 1:10000){
d.s.onarian<-ovarian[sample(1:nrow(ovarian), nrow(ovarian), replace = T),]
d.s.muhaz<-muhaz(d.s.onarian$futime, d.s.onarian$fustat, min.time=0, max.time=744 )
h.df<-cbind(h.df, d.s.muhaz$haz.est)
}
h.df$upper.ci<-apply(h.df[,c(-1,-2)], 1, FUN=function(x) quantile(x, probs = 0.975))
h.df$lower.ci<-apply(h.df[,c(-1,-2)], 1, FUN=function(x) quantile(x, probs = 0.025))
plot(h.df$est, h.df$h.orig, type="l", ylim=c(0,0.003), lwd=3)
lines(h.df$est, h.df$upper.ci, lty=3, lwd=3)
lines(h.df$est, h.df$lower.ci, lty=3, lwd=3)
Setting max.time seems to works, every bootstrap sample hast the same estimating grid points. However the CI obtained, make little sense. Normally I would expect that the intervals are narrow at t=0 and get wider with time (less information, more uncertainty) but the intervals obtained seem to be more or less constant with time.
Bootstrapping provides the answer as the commenter suggested. Your intuitions are right that you should expect the CIs to widen as the number at risk decreases. However, this effect is going to be diminished by the smoothing process and the longer the interval over which smoothing is applied the less you should notice the change in size of the CI. Try smoothing over a sufficiently short interval and you should notice the CIs widen more noticeably.
As you may find, these smoothed hazard plots can be of very limited use and are highly sensitive to how the smoothing is done. As an exercise, it is instructive to simulate survival times from a series of Weibull distributions w/ the shape parameter set to 0.8, 1.0, 1.2, and then look at these smoothed hazard plots and try to categorize them. To the extent that these plots are informative, it should be fairly easy to tell the difference between those three curves based on the trend rate of the hazard function. YMMV, but I haven't been very impressed with the results when I've done this tests with reasonable sample sizes consistent with clinical trials in oncology.
As an alternative to smoothed hazard plots, you might try fitting piecewise exponential curves using the method of Han et al. (http://www.ncbi.nlm.nih.gov/pubmed/23900779) and bootstrapping that. Their algorithm will identify the break points at which there is a statistically significant change in the hazard rate and may give you a better sense of the trend in the hazard rate than smoothed hazard plots. It will also avoid the somewhat arbitrary but consequential choice of smoothing parameters.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With