
Map a vector of characters to lm formula in R

Tags:

r

purrr

I'm trying to make a list of lm objects using purrr::map. Using mtcars as an example:

vars <- c('hp', 'wt', 'disp')
map(vars, ~lm(mpg~.x, data=mtcars))

This fails with:

Error in model.frame.default(formula = mpg ~ .x, data = mtcars, drop.unused.levels = TRUE) : variable lengths differ (found for '.x')

I also tried:

map(vars, function(x) {x=sym(x); lm(mpg~!!x, data=mtcars)})

That gave the error message:

Error in !x : invalid argument type

Can anyone tell me what I did wrong? Thanks in advance.

zesla asked Aug 24 '17 03:08


2 Answers

The usual way is to paste the formulas together as strings, convert them by mapping as.formula (you can't make a vector of formulas; it has to be a list), and then map lm. You can combine it all into a single call if you like, but I've come to prefer mapping single functions, which makes the code easier to read:

library(purrr)

c('hp', 'wt', 'disp') %>% 
    paste('mpg ~', .) %>% 
    map(as.formula) %>% 
    map(lm, data = mtcars)
#> [[1]]
#> 
#> Call:
#> .f(formula = .x[[i]], data = ..1)
#> 
#> Coefficients:
#> (Intercept)           hp  
#>    30.09886     -0.06823  
#> 
#> 
#> [[2]]
#> 
#> Call:
#> .f(formula = .x[[i]], data = ..1)
#> 
#> Coefficients:
#> (Intercept)           wt  
#>      37.285       -5.344  
#> 
#> 
#> [[3]]
#> 
#> Call:
#> .f(formula = .x[[i]], data = ..1)
#> 
#> Coefficients:
#> (Intercept)         disp  
#>    29.59985     -0.04122

It's actually unnecessary to call map(as.formula), since lm will coerce a string into a formula, but not all modeling functions are so generous (e.g. mgcv::gam).
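As a small sketch of the coercion point above: base R's reformulate() builds formula objects directly from variable names, which avoids pasting strings, and a plain string passed to lm gives the same fit. The names formulas, models and fit_from_string are illustrative only, and naming the list with set_names is an optional extra that is not part of the original answer.

```r
library(purrr)

vars <- c("hp", "wt", "disp")

# reformulate() builds a formula object from term labels and a response name
formulas <- map(vars, reformulate, response = "mpg")

# Name the list by predictor so each model can be pulled out by name
models <- map(set_names(formulas, vars), lm, data = mtcars)

# lm() also accepts a plain string and coerces it to a formula itself
fit_from_string <- lm("mpg ~ hp", data = mtcars)

# Both routes should give the same coefficients
all.equal(coef(fit_from_string), coef(models$hp))
```

With the list named, models$wt or models[["disp"]] retrieves a specific fit without relying on position.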

A downside of this approach is that the call listed in each object looks funky, but the coefficients tell you which is which easily enough anyway. A useful alternative is to keep the formula as a string in one column of a data.frame and the model in a list column, e.g.

library(tidyverse)

# tibble() replaces data_frame(), which is deprecated in current tibble releases
tibble(formula = paste('mpg ~', c('hp', 'wt', 'disp')), 
       model = map(formula, lm, data = mtcars))
#> # A tibble: 3 x 2
#>      formula    model
#>        <chr>   <list>
#> 1   mpg ~ hp <S3: lm>
#> 2   mpg ~ wt <S3: lm>
#> 3 mpg ~ disp <S3: lm>
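If the odd-looking stored call is a concern, one possible workaround (a sketch, not something the answer itself proposes) is to splice each formula into the lm call with bquote() before evaluating it, so that the recorded call contains the actual formula. The names vars, models and f are illustrative.

```r
library(purrr)

vars <- c("hp", "wt", "disp")

models <- map(vars, function(v) {
  # Build the formula for this predictor
  f <- reformulate(v, response = "mpg")
  # bquote() substitutes the formula into the call before lm() runs,
  # so match.call() inside lm() records the formula itself
  eval(bquote(lm(.(f), data = mtcars)))
})

# The stored call now refers to lm and to the real formula
models[[1]]$call
```

The trade-off is readability: the eval(bquote(...)) idiom is less obvious than a plain map over lm, so it is probably only worth it when something downstream reads the call.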
alistaire answered Sep 28 '22 16:09


The elegant tidyverse approach demonstrated by @alistaire worked well for me until I tried to pass the list column to the stargazer package and received "% Error: Unrecognized object type."

In case it is helpful for anyone else trying to use purrr map and stargazer, this slight modification solved the issue:

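library(tidyverse)
library(stargazer)

# Wrapping lm() in a function means each stored call names lm directly
models_out <- tibble(
    formula = paste('mpg ~', c('hp', 'wt', 'disp')), 
    model = map(
        .x = formula, 
        .f = function(x) lm(x, data = mtcars)
    )
)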

stargazer(models_out$model, type = 'text')

===========================================================
                                   Dependent variable:     
                              -----------------------------
                                           mpg             
                                 (1)       (2)       (3)   
-----------------------------------------------------------
hp                            -0.068***                    
                               (0.010)                     
                                                           
wt                                      -5.344***          
                                         (0.559)           
                                                           
disp                                              -0.041***
                                                   (0.005) 
                                                           
Constant                      30.099*** 37.285*** 29.600***
                               (1.634)   (1.878)   (1.230) 
                                                           
-----------------------------------------------------------
Observations                     32        32        32    
R2                              0.602     0.753     0.718  
Adjusted R2                     0.589     0.745     0.709  
Residual Std. Error (df = 30)   3.863     3.046     3.251  
F Statistic (df = 1; 30)      45.460*** 91.375*** 76.513***
===========================================================
Note:                           *p<0.1; **p<0.05; ***p<0.01
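A plausible explanation for why the wrapper helps, offered as an assumption rather than something confirmed in this thread, is that stargazer inspects the call stored in each model to recognise its type. The sketch below only compares the stored calls from the two approaches; it does not call stargazer, and the names direct and wrapped are illustrative.

```r
library(purrr)

formulas <- paste("mpg ~", c("hp", "wt", "disp"))

# Passing lm straight to map(): the call is recorded in terms of map's internals
direct <- map(formulas, lm, data = mtcars)

# Wrapping lm in a function: the call is recorded as a call to lm
wrapped <- map(formulas, function(x) lm(x, data = mtcars))

direct[[1]]$call   # exact form may vary with the purrr version
wrapped[[1]]$call  # a call whose function name is lm

# The fitted coefficients are the same either way
all.equal(coef(direct[[1]]), coef(wrapped[[1]]))
```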
Omar Wasow answered Sep 28 '22 16:09
