
How to fit multiple models on multiple datasets in purrr?

Tags:

r

purrr

I have the following tibble

tribble(
  ~func, ~models, ~data,
  'lm'  ,  formula = mpg ~ disp, mtcars,
  'lm'  ,  formula = mpg ~ disp, filter(mtcars, carb < 4)
)

Now I would like to fit, row by row, the model type specified in func with the formula in models on the dataset in data. I was trying to use invoke_map like this, but it doesn't work:

tribble(
  ~func, ~models, ~data,
  'lm'  ,  formula = mpg ~ disp, mtcars,
  'lm'  ,  formula = mpg ~ disp, filter(mtcars, carb < 4)
) %>% invoke_map(func, list(models, data))
Dambo asked Sep 20 '17 22:09



2 Answers

invoke_map can work on a dataset like this if the list of parameters for each model is in a single variable.

That would look something like the following. Note the curly braces around the invoke_map call: they stop the pipe from inserting the tibble as the first argument, so each column can be referenced as .$column within the pipe chain.

tribble(~func, ~params,
        'lm'  ,  list(formula = mpg ~ disp, data = mtcars),
        'lm'  ,  list(formula = mpg ~ disp, data = filter(mtcars, carb < 4) ) ) %>% 
     {invoke_map(.$func, .$params)}

If you need to go from your current tribble with arguments in multiple columns to arguments as a list in a single column, you could do something like

tribble(~func, ~models, ~data,
        'lm'  ,  formula = mpg ~ disp, mtcars,
        'lm'  ,  formula = mpg ~ disp, filter(mtcars, carb < 4) ) %>%
     mutate(params = pmap(list(models, data), list) ) %>% 
     {invoke_map(.$func, .$params)}
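
If invoke_map is not available in your purrr version (recent releases mark the invoke family as deprecated), the same row-by-row fit can be sketched with pmap and base R's do.call. This is only an illustration under that assumption; the names specs and fits are made up for the example.

```r
library(dplyr)   # filter()
library(purrr)   # pmap()
library(tibble)  # tribble()

# One row per model: function name, formula, and data set.
specs <- tribble(
  ~func, ~models,    ~data,
  "lm",  mpg ~ disp, mtcars,
  "lm",  mpg ~ disp, filter(mtcars, carb < 4)
)

# Walk the three columns in parallel and call the named function
# with the formula and data taken from the same row.
fits <- pmap(
  list(specs$func, specs$models, specs$data),
  function(f, fml, d) do.call(f, list(formula = fml, data = d))
)

# fits is a plain list with one fitted model per row of specs.
lapply(fits, coef)
```

Because do.call accepts a function name as a string, the func column can stay as character, just as in the original tribble.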

If your ultimate goal is to add the model fits to the dataset, you can use invoke_map within mutate.

tribble(~func, ~models, ~data,
        'lm'  ,  formula = mpg ~ disp, mtcars,
        'lm'  ,  formula = mpg ~ disp, filter(mtcars, carb < 4) ) %>%
     mutate(params = pmap(list(models, data), list),
            fit = invoke_map(func, params ) )
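
Once the fits sit in a list-column, further columns can be derived from them with the typed map functions. The following is a rough sketch rather than part of the original answer: it builds the fit column with map2 and do.call instead of invoke_map, and the column names slope and r_squared are hypothetical.

```r
library(dplyr)
library(purrr)
library(tibble)

results <- tribble(
  ~func, ~params,
  "lm",  list(formula = mpg ~ disp, data = mtcars),
  "lm",  list(formula = mpg ~ disp, data = filter(mtcars, carb < 4))
) %>%
  mutate(
    # Call each named function with its own argument list.
    fit       = map2(func, params, ~ do.call(.x, .y)),
    # Pull one number per model out of the list-column.
    slope     = map_dbl(fit, ~ coef(.x)[["disp"]]),
    r_squared = map_dbl(fit, ~ summary(.x)$r.squared)
  )

select(results, func, slope, r_squared)
```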
aosmith answered Nov 14 '22 22:11



We want to set up a list of lists that contains all the arguments in parallel, then call a single function that takes all of them. We'll do this with pmap. Conveniently, the modelr package has fit_with that takes a modelling function, a formula, and a dataset. Since pmap returns a list of lists, we'll flatten it into a list-column in the dataframe.

tribble(~funcs,  ~models,       ~dat,
        glm,    "len ~ dose",   ToothGrowth,
        lm,     "len ~ dose",   filter(ToothGrowth, supp == "VC")) %>% 
  mutate(fit = flatten(pmap(.l = list(.f = funcs, .formulas = models, data = dat), 
                            .f = modelr::fit_with))) 
# A tibble: 2 x 4
   funcs     models                   dat       fit
  <list>      <chr>                <list>    <list>
1  <fun> len ~ dose <data.frame [60 x 3]> <S3: glm>
2  <fun> len ~ dose <data.frame [30 x 3]>  <S3: lm>

You can use the list-column fit in various model tidying functions from broom.

library(broom)

tribble(~funcs,  ~models,       ~dat,
        glm,    "len ~ dose",   ToothGrowth,
        lm,     "len ~ dose",   filter(ToothGrowth, supp == "VC")) %>% 
  mutate(fit = flatten(pmap(.l = list(.f = funcs, .formulas = models, data = dat), 
                            .f = modelr::fit_with))) %>% 
  do(map_dfr(.$fit, tidy, .id = "dataset"))
  dataset        term  estimate std.error statistic      p.value
1       1 (Intercept)  7.422500 1.2600826  5.890487 2.064211e-07
2       1        dose  9.763571 0.9525329 10.250114 1.232698e-14
3       2 (Intercept)  3.295000 1.4270601  2.308943 2.854201e-02
4       2        dose 11.715714 1.0787561 10.860392 1.509369e-11

Update

Another approach, more similar to your first one:

tribble(~funcs,  ~models,       ~dat,
        "glm",    len ~ dose,   ToothGrowth,
        "lm",     len ~ dose,   filter(ToothGrowth, supp == "VC")) %>% 
  rowwise() %>% 
  mutate(fit = invoke_map(.f = funcs, .x = list(list(formula = models, data = dat)))) %>% 
  {map_dfr(.$fit, tidy, .id = "dataset")}

Note the use of quotes around the function names and the use of rowwise to make each list-element of the list of lists (.x) be length 1.
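
A column of model functions may hold either quoted names or function objects. As a small sketch, assuming base R's match.fun is an acceptable way to normalise both forms (funcs and fit_one are hypothetical names for this example):

```r
library(purrr)

# One entry is a function name, the other a function object.
funcs <- list("glm", lm)

# match.fun() returns a function unchanged and looks up a name,
# so both kinds of entry can be called the same way afterwards.
fit_one <- function(f) {
  f <- match.fun(f)
  f(len ~ dose, data = ToothGrowth)
}

fits <- map(funcs, fit_one)

# Primary class of each fit.
map_chr(fits, ~ class(.x)[1])
```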

Brian answered Nov 14 '22 22:11
