Purrr and several multiple regressions in R

Tags:

r

linear-regression

purrr

broom

I know there are several ways to compare regression models. One way is to create a sequence of models (from simple to multiple regression) and compare R², adjusted R², etc.:

Mod1: y = b0 + b1*x1
Mod2: y = b0 + b1*x1 + b2*x2
Mod3: y = b0 + b1*x1 + b2*x2 + b3*x3 (etc.)

I'm aware that some packages can perform stepwise regression, but I'm trying to do this with purrr. I was able to create several simple linear models (thanks to this post here), and now I want to know how to create regression models that add one specific IV to the equation at a time:

Reproducible code:

data(mtcars)
library(tidyverse)
library(purrr)
library(broom)
iv_vars <- c("cyl", "disp", "hp")
make_model <- function(nm) lm(mtcars[c("mpg", nm)])
fits <- Map(make_model, iv_vars)
glance_tidy <- function(x) c(unlist(glance(x)), unlist(tidy(x)[, -1]))
t(iv_vars %>% Map(f = make_model) %>% sapply(glance_tidy))

Output: [image: output of linear models, https://i.stack.imgur.com/AufG2.png]

What I want:

Mod1: mpg ~ cyl
Mod2: mpg ~ cyl + disp
Mod3: mpg ~ cyl + disp + hp

Thanks much.

asked Nov 28 '17 23:11

Luis

2 Answers

I would begin by creating a tibble with a list column that stores your formulae. Then map lm over the formulae, and map glance over the resulting models.

library(tidyverse)
library(broom)

mtcars %>% as_tibble()

formula <- c(mpg ~ cyl, mpg ~ cyl + disp)

output <-
  tibble(formula) %>% 
  mutate(model = map(formula, ~lm(formula = .x, data = mtcars)),
         glance = map(model, glance))

output$glance

output %>% unnest(glance)
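
The same pattern extends to the three models asked about in the question. The following is only a sketch: the labels `mod1` to `mod3` and the names `formulas` and `comparison` are made up for illustration, and the columns kept at the end are just one possible choice of fit statistics.

```r
library(tidyverse)
library(broom)

# A named list of formulas; the names (hypothetical labels) identify each model
formulas <- list(
  mod1 = mpg ~ cyl,
  mod2 = mpg ~ cyl + disp,
  mod3 = mpg ~ cyl + disp + hp
)

comparison <- tibble(label = names(formulas), formula = formulas) %>%
  mutate(
    model  = map(formula, ~ lm(.x, data = mtcars)),  # fit one lm per formula
    glance = map(model, glance)                      # one-row summary per model
  ) %>%
  unnest(glance) %>%
  select(label, r.squared, adj.r.squared, AIC)       # keep a few fit statistics

comparison
```

Keeping the formulas in a named list means the model labels travel with the results, which makes the final table easier to read than positional row numbers.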

answered Oct 04 '22 23:10

kputschko

You could cumulatively paste over your vector of iv_vars to get the combinations you want. I used the code in this answer to do this.

I use the plus sign as the separator between variables to get ready for the formula notation in lm.

cumpaste = function(x, .sep = " ") {
     Reduce(function(x1, x2) paste(x1, x2, sep = .sep), x, accumulate = TRUE)
}

( iv_vars_cum = cumpaste(iv_vars, " + ") )

[1] "cyl"             "cyl + disp"      "cyl + disp + hp"
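
Since the question is specifically about purrr, the cumulative paste can probably also be written with `purrr::accumulate()`, which plays the same role as `Reduce(..., accumulate = TRUE)`. This is a sketch of an alternative, not part of the original code above; it should produce the same three strings as `cumpaste()`.

```r
library(purrr)

iv_vars <- c("cyl", "disp", "hp")

# accumulate() keeps every intermediate result of the pairwise paste,
# so each element contains all predictors up to that position
iv_vars_cum <- accumulate(iv_vars, ~ paste(.x, .y, sep = " + "))

iv_vars_cum
```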

Then switch the make_model function to use a formula and a dataset. The explanatory variables, separated by the plus sign, get passed to the function after the tilde in the formula. Everything is pasted together, which lm conveniently interprets as a formula.

make_model = function(nm) {
     lm(paste0("mpg ~", nm), data = mtcars)
}

We can see that this works as desired, returning a model with both explanatory variables.

make_model("cyl + disp")

Call:
lm(formula = paste0("mpg ~", nm), data = mtcars)

Coefficients:
(Intercept)          cyl         disp  
   34.66099     -1.58728     -0.02058
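
If you would rather not paste strings, base R's `reformulate()` builds a formula object directly from a character vector of predictor names. The sketch below assumes you pass the predictors as a vector such as `c("cyl", "disp")` instead of a pre-pasted string; `make_model_rf` is a hypothetical name used only for this illustration.

```r
# reformulate() lives in the stats package, which is attached by default
make_model_rf <- function(vars) {
  # vars: character vector of predictor names, e.g. c("cyl", "disp")
  lm(reformulate(vars, response = "mpg"), data = mtcars)
}

fit <- make_model_rf(c("cyl", "disp"))
coef(fit)
```

The coefficients should match the `make_model("cyl + disp")` fit shown above, since both describe the same model.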

You'll likely need to rethink how you want to combine the info together, as you will now have differing numbers of columns due to the increased number of coefficients.

A possible option is to add dplyr::bind_rows to your glance_tidy function and then use map_dfr from purrr for the final output.

glance_tidy = function(x) {
     dplyr::bind_rows( c( unlist(glance(x)), unlist(tidy(x)[, -1]) ) )
}

iv_vars_cum %>% 
     Map(f = make_model) %>% 
     map_dfr(glance_tidy, .id = "model")

# A tibble: 3 x 28

            model r.squared adj.r.squared    sigma statistic      p.value    df    logLik      AIC
            <chr>     <dbl>         <dbl>    <dbl>     <dbl>        <dbl> <dbl>     <dbl>    <dbl>
1             cyl 0.7261800     0.7170527 3.205902  79.56103 6.112687e-10     2 -81.65321 169.3064
2      cyl + disp 0.7595658     0.7429841 3.055466  45.80755 1.057904e-09     3 -79.57282 167.1456
3 cyl + disp + hp 0.7678877     0.7430186 3.055261  30.87710 5.053802e-09     4 -79.00921 168.0184 ...
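
One way to sidestep the differing column counts altogether is to keep the model-level statistics and the coefficients in two separate tables, with the coefficients in long format (one row per term per model). This is a sketch of that alternative rather than the approach above; the names `fits`, `fit_stats` and `coefs` are made up for the example.

```r
library(tidyverse)
library(broom)

iv_vars_cum <- c("cyl", "cyl + disp", "cyl + disp + hp")

# Fit one model per predictor string; set_names() labels each fit with its string
fits <- iv_vars_cum %>%
  set_names() %>%
  map(~ lm(as.formula(paste0("mpg ~ ", .x)), data = mtcars))

# One row per model: r.squared, AIC, etc.
fit_stats <- map_dfr(fits, glance, .id = "model")

# One row per coefficient per model: term, estimate, std.error, etc.
coefs <- map_dfr(fits, tidy, .id = "model")

fit_stats
coefs
```

Both tables share the `model` column, so they can be joined or filtered by model when needed.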

answered Oct 04 '22 22:10

aosmith
