Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Multiple Logistic Regression with Interaction between Quantitative and Qualitative Explanatory Variables

As a follow up to this question, I fitted the Multiple Logistic Regression with Interaction between Quantitative and Qualitative Explanatory Variables. MWE is given below:

Type  <- rep(x=LETTERS[1:3], each=5)
Conc  <- rep(x=seq(from=0, to=40, by=10), times=3)
Total <- 50
Kill  <- c(10, 30, 40, 45, 38, 5, 25, 35, 40, 32, 0, 32, 38, 47, 40)

df <- data.frame(Type, Conc, Total, Kill)

fm1 <- 
  glm(
      formula = Kill/Total~Type*Conc
    , family  = binomial(link="logit")
    , data    = df
    , weights = Total
    )

summary(fm1)

Call:
glm(formula = Kill/Total ~ Type * Conc, family = binomial(link = "logit"), 
    data = df, weights = Total)

Deviance Residuals: 
   Min      1Q  Median      3Q     Max  
-4.871  -2.864   1.204   1.706   2.934  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept) -0.65518    0.23557  -2.781  0.00541 ** 
TypeB       -0.34686    0.33677  -1.030  0.30302    
TypeC       -0.66230    0.35419  -1.870  0.06149 .  
Conc         0.07163    0.01152   6.218 5.04e-10 ***
TypeB:Conc  -0.01013    0.01554  -0.652  0.51457    
TypeC:Conc   0.03337    0.01788   1.866  0.06201 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 277.092  on 14  degrees of freedom
Residual deviance:  96.201  on  9  degrees of freedom
AIC: 163.24

Number of Fisher Scoring iterations: 5

anova(object=fm1, test="LRT")

Analysis of Deviance Table

Model: binomial, link: logit

Response: Kill/Total

Terms added sequentially (first to last)


          Df Deviance Resid. Df Resid. Dev Pr(>Chi)    
NULL                         14    277.092             
Type       2    6.196        12    270.895  0.04513 *  
Conc       1  167.684        11    103.211  < 2e-16 ***
Type:Conc  2    7.010         9     96.201  0.03005 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


df$Pred <- predict(object=fm1, data=df, type="response")

df1 <- with(data=df,
               expand.grid(Type=levels(Type)
                           , Conc=seq(from=min(Conc), to=max(Conc), length=51)
                           )
      )
df1$Pred <- predict(object=fm1, newdata=df1, type="response")

library(ggplot2)
ggplot(data=df, mapping=aes(x=Conc, y=Kill/Total, color=Type)) + geom_point() +
  geom_line(data=df1, mapping=aes(x=Conc, y=Pred), linetype=2) +
  geom_hline(yintercept=0.5,col="gray")

enter image description here

I want to calculate LD50, LD90 and LD95 with their confidence intervals. As the interaction is significant so I want to calculate LD50, LD90 and LD95 with their confidence intervals for each Type (A, B, and C) separately.



LD stands for lethal dose. It is the amount of substance required to kill X% (LD50 = 50%) of the test population.

Edited Type is a qualitative variable representing different types of drugs and Conc is a quantitative variable representing different Concentrations of drugs.

like image 561
MYaseen208 Avatar asked Mar 17 '16 15:03

MYaseen208


2 Answers

You use the drc package to fit logistic dose-response models.

First fit the model

require(drc)
mod <- drm(Kill/Total ~ Conc, 
           curveid = Type, 
           weights = Total, 
           data = df, 
           fct =  L.4(fixed = c(NA, 0, 1, NA)), 
           type = 'binomial')

Here curveid=specifies the grouping of the data and fct= specifies a 4 parameter logistic function, with parameters for lower and upper bond fixed at 0 and 1.

Note the differences to glm are negligible:

df2 <- with(data=df,
            expand.grid(Conc=seq(from=min(Conc), to=max(Conc), length=51),
                        Type=levels(Type)))
df2$Pred <- predict(object=mod, newdata = df2)

Here's a histgramm of the differences to the glm prediction

hist(df2$Pred - df1$Pred)

enter image description here

Estimate Effective Doses (and CI) from the model

This is easy with the ED() function:

ED(mod, c(50, 90, 95), interval = 'delta')

Estimated effective doses
(Delta method-based confidence interval(s))

     Estimate Std. Error   Lower  Upper
A:50   9.1468     2.3257  4.5885 13.705
A:90  39.8216     4.3444 31.3068 48.336
A:95  50.2532     5.8773 38.7338 61.773
B:50  16.2936     2.2893 11.8067 20.780
B:90  52.0214     6.0556 40.1527 63.890
B:95  64.1714     8.0068 48.4784 79.864
C:50  12.5477     1.5568  9.4963 15.599
C:90  33.4740     2.7863 28.0129 38.935
C:95  40.5904     3.6006 33.5334 47.648

For each group we get ED50, ED90 & ED95 with CI.

like image 143
EDi Avatar answered Sep 28 '22 23:09

EDi


Your link function of choice (\eta= X\hat\beta) has variance for a new observation (x_0): V_{x_0}=x_0^T(X^TWX)^{-1}x_0

So, for a set of candidate doses, we can predict the expected percentage of deaths using the inverse function:

newdata= data.frame(Type=rep(x=LETTERS[1:3], each=5),
                    Conc=rep(x=seq(from=0, to=40, by=10), times=3))
mm <- model.matrix(fm1, newdata)

# get link on link terms (could also use predict)
eta0 <- apply(mm, 1, function(i) sum(i * coef(fm1)))

# inverse logit function
ilogit <- function(x) return(exp(x) / (1+ exp(x)))

# predicted probs
ilogit(eta0)


# for comfidence intervals we can use a normal approximation
lethal_dose <- function(mod, newdata, alpha) {
  qn <- qnorm(1 - alpha /2)
  mm <- model.matrix(mod, newdata)
  eta0 <- apply(mm, 1, function(i) sum(i * coef(fm1)))
  var_mod <- vcov(mod)

  se <- apply(mm, 1, function(x0, var_mod) {
    sqrt(t(x0) %*% var_mod %*% x0)}, var_mod= var_mod)

  out <- cbind(ilogit(eta0 - qn * se),
               ilogit(eta0),
               ilogit(eta0 + qn * se))
  colnames(out) <- c("LB_CI", "point_est", "UB_CI")

  return(list(newdata=newdata,
              eff_dosage= out))
}

lethal_dose(fm1, newdata, alpha= 0.05)$eff_dosage
$eff_dosage
       LB_CI point_est     UB_CI
1  0.2465905 0.3418240 0.4517820
2  0.4361703 0.5152749 0.5936215
3  0.6168088 0.6851225 0.7462674
4  0.7439073 0.8166343 0.8722545
5  0.8315325 0.9011443 0.9439316
6  0.1863738 0.2685402 0.3704385
7  0.3289003 0.4044270 0.4847691
8  0.4890420 0.5567386 0.6223914
9  0.6199426 0.6990808 0.7679095
10 0.7207340 0.8112133 0.8773662
11 0.1375402 0.2112382 0.3102215
12 0.3518053 0.4335213 0.5190198
13 0.6104540 0.6862145 0.7531978
14 0.7916268 0.8620545 0.9113443
15 0.8962097 0.9469715 0.9736370

Rather than doing this manually, you could also manipulate:

predict.glm(fm1, newdata, se=TRUE)$se.fit

like image 27
Alex W Avatar answered Sep 29 '22 01:09

Alex W