I have an accelerated failure time model in SAS LIFEREG that I'd like to plot. Because SAS is to profoundly bad at graphing, I'd like to actually re-generate the data for the curves in R and plot them there. SAS puts out a scale (in the case of the exponential distribution fixed to 1), an intercept, and a regression coefficient for being in the exposed or unexposed population.
There's two curves, one for the exposed and one for the unexposed population. One of the models is an exponential distribution, and I've produced the data and graph like so:
intercept <- 5.00
effect<- -0.500
data<- data.frame(time=seq(0:180)-1)
data$s_unexposed <- apply(data,1,function(row) exp(-(exp(-intercept))*row[1]))
data$s_exposed <- apply(data,1,function(row) exp(-(exp(-(intercept+effect))*row[1])))
plot(data$time,data$s_unexposed, type="l", ylim=c(0,1) ,xaxt='n',
xlab="Days since Infection", ylab="Percent Surviving", lwd=2)
axis(1, at=c(0, 20, 40, 60, 80, 100, 120, 140, 160, 180))
lines(data$time,data$s_exposed, col="red",lwd=2)
legend("topright", c("ICU Patients", "Non-ICU Patients"), lwd=2, col=c("red","black") )
Which gives me this:
Not the prettiest graph ever, but I don't really know my way around ggplot2 enough to spruce it up. But more importantly, I have a second set of data that comes from a Log Normal distribution, rather than an exponential, and my attempts to generate the data for that have failed utterly - the incorporation of the cdf for the normal distribution and the like puts it beyond my R skills.
Anyone able to point me in the right direction, using the same numbers, and a scale parameter of 1?
The method is simple: you use the RAND function to generate X ~ N(μ, σ), then compute Y = exp(X). The random variable Y is lognormally distributed with parameters μ and σ. This is the standard definition, but notice that the parameters are specified as the mean and standard deviation of X = log(Y).
To plot the probability density function for a log normal distribution in R, we can use the following functions: dlnorm(x, meanlog = 0, sdlog = 1) to create the probability density function. curve(function, from = NULL, to = NULL) to plot the probability density function.
Use distribution objects to inspect the relationship between normal and lognormal distributions. Create a lognormal distribution object by specifying the parameter values. Compute the mean of the lognormal distribution. The mean of the lognormal distribution is not equal to the mu parameter.
The survival function at time t for a log-normal model can be represented in R with 1 - plnorm()
, where plnorm()
is the log-normal cumulative distribution function. To illustrate, we'll first put your plot into a function for convenience:
## Function to plot aft data
plot.aft <- function(x, legend = c("ICU Patients", "Non-ICU Patients"),
xlab = "Days since Infection", ylab="Percent Surviving", lwd = 2,
col = c("red", "black"), at = c(0, 20, 40, 60, 80, 100, 120, 140, 160, 180),
...)
{
plot(x[, 1], x[, 2], type = "l", ylim = c(0, 1), xaxt = "n",
xlab = xlab, ylab = ylab, col = col[2], lwd = 2, ...)
axis(1, at = at)
lines(x[, 1], x[, 3], col = col[1], lwd=2)
legend("topright", legend = legend, lwd = lwd, col = col)
}
Next, we'll specify the coefficients, variables, and models, and then generate the survival probabilities for the exponential and log-normal models:
## Specify coefficients, variables, and linear models
beta0 <- 5.00
beta1 <- -0.500
icu <- c(0, 1)
t <- seq(0, 180)
linmod <- beta0 + (beta1 * icu)
names(linmod) <- c("unexposed", "exposed")
## Generate s(t) from exponential AFT model
s0.exp <- dexp(exp(-linmod["unexposed"]) * t)
s1.exp <- dexp(exp(-linmod["exposed"]) * t)
## Generate s(t) from lognormal AFT model
s0.lnorm <- 1 - plnorm(t, meanlog = linmod["unexposed"])
s1.lnorm <- 1 - plnorm(t, meanlog = linmod["exposed"])
Finally, we can plot the survival probabilities:
## Plot survival
plot.aft(data.frame(t, s0.exp, s1.exp), main = "Exponential model")
plot.aft(data.frame(t, s0.lnorm, s1.lnorm), main = "Log-normal model")
And the resulting figures:
Note that
plnorm(t, meanlog = linmod["exposed"])
is the same as
pnorm((log(t) - linmod["exposed"]) / 1)
which is the Φ in the canonical equation for the log-normal survival function: S(t) = 1 − Φ((ln(t) − µ) / σ)
As I'm sure you know, there are a number of R packages that can handle accelerated failure time models with left, right, or interval censoring, as listed in the survival task view, in case you happen to develop a preference for R over SAS.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With