Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Report p-value from lme model (random intercept) with gtsummary using tbl_summary() and add_p()

Tags:

r

gtsummary

I am working in R using gtsummary, trying to report the p-value from the random intercept model

lme(Variable1 ~ Study_Group, random = ~1|ID, na.action = na.omit, data = study_dat) 

in a summary/comparison table using

lme_stat_function <- function(data, variable, by, group, ...){
  nlme::lme(variable ~ by, random = ~1|group, na.action = na.omit, data = data) %>% 
    broom.mixed::tidy()    
}

lme_stat_render <- list(
  all_continuous() ~ "lme_stat_function"
)

study_dat %>% 
  select(ID, Study_Group, Variable1) %>% 
  tbl_summary(
    by = Study_Group,
    type = list(
      where(is.numeric) ~ "continuous2"
    )
  ) %>%
  bold_labels() %>%
  italicize_levels() %>%
  add_p(
    test = lme_stat_render 
  ) 

I have been wrestling with this seemingly simple issue for days and I cannot seem to get anything other than an error that effectively reads For variable Variable1 (Study_Group) and "p.value" statistic: variable lengths differ (found for 'group').

I've been reading through the documentation and can only find pseudo-code for similar scenarios, but not exactly this one. The mixed model above does work, but the function to extract that p-value and report it in the table for the respective variable does not. Here is a look at the general structure of my data:

structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 84, 84, 
85, 86, 87, 87, 88, 88, 89, 89, 90, 90, 91, 92, 93, 93, 94, 95, 
95, 96, 97, 97), Variable1 = c(6, 29, 16, 15, 19, 15, 
4, 8, 26, 9, 7, 27, 3, 31, 16, 19, 8, 12, 5, 7, 7, 12, 9, 54, 
5, 14, 4, 3, 11, 17, 7), Study_Group = structure(c(1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L), levels = c("Group A", 
"Group B"), class = "factor")), row.names = c(NA, -31L
), class = c("tbl_df", "tbl", "data.frame"))

I recognize that the repeated measurements here are only repeated for one group, but I'm not debating statistical methodology here. I'm just trying to get this to work as it is. Any pointers?

like image 356
Carson Keeter Avatar asked Dec 02 '25 19:12

Carson Keeter


1 Answers

You're on the right track. It looks like the issue may be in the construction of the model. In the example below, I've used reformulate() which is a great helper for creating formulas from string input.

library(gtsummary)

study_dat <-
  structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 84, 84, 
                        85, 86, 87, 87, 88, 88, 89, 89, 90, 90, 91, 92, 93, 93, 94, 95, 
                        95, 96, 97, 97), 
                 Variable1 = c(6, 29, 16, 15, 19, 15, 
                               4, 8, 26, 9, 7, 27, 3, 31, 16, 19, 8, 12, 5, 7, 7, 12, 9, 54, 
                               5, 14, 4, 3, 11, 17, 7), 
                 Study_Group = structure(c(1L, 1L, 
                                           1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 
                                           2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L), levels = c("Group A", 
                                                                                                           "Group B"), class = "factor")), row.names = c(NA, -31L
                                                                                                           ), 
            class = c("tbl_df", "tbl", "data.frame"))


# you can construct a function that returns a single 
# row data frame with the p-value in it
# details for writing this function: https://www.danieldsjoberg.com/gtsummary/reference/tests.html#custom-functions
my_lme_test <- function(data, by, variable, group, ...) {
  nlme::lme(
    data = data, 
    fixed = reformulate(response = variable, termlabels = by),
    random = reformulate(glue::glue("1 | {group}"))
  ) |> 
    broom.mixed::tidy() |> 
    dplyr::filter(startsWith(term, by)) # keep the row associated with treatment
}
# test the function
my_lme_test(study_dat, by = 'Study_Group', variable = "Variable1", group = "ID")
#> # A tibble: 1 × 8
#>   effect group term               estimate std.error    df statistic p.value
#>   <chr>  <chr> <chr>                 <dbl>     <dbl> <dbl>     <dbl>   <dbl>
#> 1 fixed  <NA>  Study_GroupGroup B    -2.48      3.94    21    -0.631   0.535


study_dat |> 
  select(ID, Study_Group, Variable1) |> 
  tbl_summary(
    by = Study_Group,
    include = -ID,
    type = where(is.numeric) ~ "continuous2"
  ) |> 
  add_p(test = everything() ~ my_lme_test, group = ID) |> 
  as_kable() # convert to kable to display on SO
Characteristic Group A N = 14 Group B N = 17 p-value
Variable1 0.5
Median (Q1, Q3) 15 (9, 19) 7 (5, 16)

Created on 2025-01-30 with reprex v2.1.1

like image 153
Daniel D. Sjoberg Avatar answered Dec 08 '25 13:12

Daniel D. Sjoberg



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!