Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Formatted latex regression tables with multiple models from broom output?

I have several models such as the example below for which I have estimates, standard errors, p-values, r2 etc. as data.frames in tidy format, but I don't have the original model objects (analysis was run on a different machine).

require(broom)
model <- lm(mpg ~ hp + cyl, mtcars)
tidy_model <- tidy(model)
glance_model <- glance(model)

# tidy_model
# # A tibble: 3 x 5
#   term        estimate std.error statistic  p.value
#   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
# 1 (Intercept)  36.9       2.19       16.8  1.62e-16
# 2 hp           -0.0191    0.0150     -1.27 2.13e- 1
# 3 cyl          -2.26      0.576      -3.93 4.80e- 4
# glance(model)
# # A tibble: 1 x 11
#   r.squared adj.r.squared sigma ...
# *     <dbl>         <dbl> <dbl>     ...
# 1     0.760         0.743  3.06      ...

There exist several packages (e.g. stargazer or texreg) which transform one or more model objects (lm, glm, etc.) into well-formatted regression tables side-by-side, see below for an example of texreg:

require(texreg)
screenreg(list(model1, model1)
# =================================
#              Model 1    Model 2  
# ---------------------------------
# (Intercept)  34.66 ***  34.66 ***
#              (2.55)     (2.55)   
# cyl          -1.59 *    -1.59 *  
#              (0.71)     (0.71)   
# disp         -0.02      -0.02    
#              (0.01)     (0.01)   
# ---------------------------------
# R^2           0.76       0.76    
# Adj. R^2      0.74       0.74    
# Num. obs.    32         32       
# RMSE          3.06       3.06    
# =================================
# *** p < 0.001, ** p < 0.01, * p < 0.05

Is there a similar package that uses tidy estimation results produced with broom as inputs rather than model objects to produce a table such as the above example?

like image 864
benedikt Avatar asked Aug 26 '18 15:08

benedikt


2 Answers

Is there a similar package that uses tidy estimation results produced with broom as inputs

Not to my knowledge, but stargazer allows you to use custom inputs to generate regression tables. This allows us to create "fake" shell tables that we can populate with values from the tidy table. Using your example

# create fake models
dat <- lapply(tidy_model$term, function(...) rnorm(10))
dat <- as.data.frame(setNames(dat, c("mpg", tidy_model$term[-1])))
f <- as.formula(paste("mpg ~", paste(tidy_model$term[-1], collapse = " + ")))
fit <- lm(f, dat)

# set up model statistics
fit_stats <- data.frame(labels = names(glance_model),
                        mod1 = round(unlist(glance_model), 3),
                        mod2 = round(unlist(glance_model), 3),
                        row.names = NULL,
                        stringsAsFactors = FALSE)

We can then feed these values into stargazer:

library(stargazer)

stargazer(fit, fit, type = "text", 
  coef = list(tidy_model$estimate, tidy_model$estimate),
  se = list(tidy_model$std.error, tidy_model$std.error),
  add.lines = lapply(1:nrow(fit_stats), function(i) unlist(fit_stats[i, ])),
  omit.table.layout = "s"
)
# ==========================================
#                   Dependent variable:     
#               ----------------------------
#                           mpg             
#                    (1)            (2)     
# ------------------------------------------
# hp                -0.019        -0.019    
#                  (0.015)        (0.015)   

# cyl             -2.265***      -2.265***  
#                  (0.576)        (0.576)   

# Constant        36.908***      36.908***  
#                  (2.191)        (2.191)   

# ------------------------------------------
# r.squared         0.741          0.741    
# adj.r.squared     0.723          0.723    
# sigma             3.173          3.173    
# statistic         41.422        41.422    
# p.value             0              0      
# df                  3              3      
# logLik           -80.781        -80.781   
# AIC              169.562        169.562   
# BIC              175.425        175.425   
# deviance         291.975        291.975   
# df.residual         29            29      
# ==========================================
# Note:          *p<0.1; **p<0.05; ***p<0.01
like image 137
Weihuang Wong Avatar answered Oct 31 '22 12:10

Weihuang Wong


I had another look at texreg, inspired by this answer, and there is a more native way to do this by defining an additional extraction method for texreg in addition to the previous answer:

extract_broom <- function(tidy_model, glance_model) {
  # get estimates/standard errors from tidy
  coef <- tidy_model$estimate
  coef.names <- as.character(tidy_model$term)
  se <- tidy_model$std.error
  pvalues <- tidy_model$p.value
  # get goodness-of-fit statistics from glance
  glance_transposed <- as_tibble(cbind(name = names(glance_model), t(glance_model)))
  gof.names <- as.character(glance_transposed$name)
  gof <- as.double(glance_transposed$value)
  gof.decimal <- gof %% 1 > 0
  tr_object <- texreg::createTexreg(coef.names = coef.names,
                                    coef = coef,
                                    se = se,
                                    pvalues = pvalues,
                                    gof.names = gof.names,
                                    gof = gof,
                                    gof.decimal = gof.decimal)
  return(tr_object)
}

This results in the following output:

texreg_model <- extract_broom(tidy_model, glance_model)
screenreg(list(texreg_model, texreg_model))

# =====================================
#                Model 1     Model 2   
# -------------------------------------
# (Intercept)     36.91 ***   36.91 ***
#                 (2.19)      (2.19)   
# hp              -0.02       -0.02    
#                 (0.02)      (0.02)   
# cyl             -2.26 ***   -2.26 ***
#                 (0.58)      (0.58)   
# -------------------------------------
# r.squared        0.74        0.74    
# adj.r.squared    0.72        0.72    
# sigma            3.17        3.17    
# statistic       41.42       41.42    
# p.value          0.00        0.00    
# df               3           3       
# logLik         -80.78      -80.78    
# AIC            169.56      169.56    
# BIC            175.42      175.42    
# deviance       291.97      291.97    
# df.residual     29          29       
# =====================================
# *** p < 0.001, ** p < 0.01, * p < 0.05
like image 43
benedikt Avatar answered Oct 31 '22 11:10

benedikt