dplyr, do(), extracting parameters from model without losing grouping variable

Tags: r, dplyr

A slightly changed example from the R help for do():

library(dplyr)

by_cyl <- group_by(mtcars, cyl)
models <- by_cyl %>% do(mod = lm(mpg ~ disp, data = .))
coefficients <- models %>% do(data.frame(coef = coef(.$mod)[[1]]))

The data frame coefficients contains the first coefficient (the intercept) of the linear model for each cyl group. How can I produce a data frame that contains not only a column with the coefficients, but also a column with the grouping variable?

Edit: I have extended the example to make my problem clearer.

Let's suppose that I want to extract the coefficients of the model and some prediction. I can do this:

by_cyl <- group_by(mtcars, cyl)
getpars <- function(df){
  fit <- lm(mpg ~ disp, data = df)
  data.frame(intercept=coef(fit)[1],slope=coef(fit)[2])
}
getprediction <- function(df){
  fit <- lm(mpg ~ disp, data = df)
  x <- df$disp
  y <- predict(fit, data.frame(disp= x), type = "response")
  data.frame(x,y)
}
pars <- by_cyl %>% do(getpars(.))
prediction <- by_cyl %>% do(getprediction(.))

The problem is that the code is redundant because it fits the model twice. My idea was to build a function that returns a list with all the information:

getAll <- function(df){
  results<-list()
  fit <- lm(mpg ~ disp, data = df)
  x <- df$disp
  y <- predict(fit, data.frame(disp= x), type = "response")

  results$pars <- data.frame(intercept=coef(fit)[1],slope=coef(fit)[2])
  results$prediction <- data.frame(x,y)

  results
}

The problem is that I don't know how to use do() with getAll to obtain, for example, just a data frame with the parameters (like the data frame pars).

Asked by danilinares on Jul 05 '14 at 15:07


2 Answers

Like this?

coefficients <- models %>% do(data.frame(coef = coef(.$mod)[[1]], group = .[[1]]))

yielding

        coef group
  1 40.87196     4
  2 19.08199     6
  3 22.03280     8
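
Here .[[1]] picks the first column of the current row of models, which is the grouping variable cyl. A minimal sketch of the same idea that refers to the column by name and also keeps the slope; the column names intercept and slope are chosen for illustration, and this assumes do() on the row-wise models result exposes the columns as .$cyl and .$mod, as in the code above:

```r
library(dplyr)

# Fit one model per cyl group; `models` has one row per group
by_cyl <- group_by(mtcars, cyl)
models <- by_cyl %>% do(mod = lm(mpg ~ disp, data = .))

# Refer to the grouping column by name instead of by position
coefficients <- models %>%
  do(data.frame(
    cyl       = .$cyl,
    intercept = coef(.$mod)[[1]],  # illustrative column name
    slope     = coef(.$mod)[[2]]   # illustrative column name
  ))
```

Referring to .$cyl by name keeps working if the column order of models changes, whereas .[[1]] depends on cyl being the first column.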

Answered by Robert Krzyzanowski on Sep 22 '22 at 02:09


Using the approach of Hadley Wickham in this video:

library(dplyr)
library(tidyr)  # provides nest() and unnest()
library(purrr)
library(broom)

fitmodel <- function(d) lm(mpg ~ disp, data = d)
by_cyl <- mtcars %>% 
  group_by(cyl) %>% 
  nest() %>%
  mutate(mod = map(data, fitmodel), 
         pars = map(mod, tidy), 
         pred = map(mod, augment))

pars <- by_cyl %>% unnest(pars)
prediction <- by_cyl %>% unnest(pred)
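
The same nest-and-map pattern should also work with a single function that returns a list, as getAll in the question does, so the model is fitted only once per group. The following is a sketch under assumptions rather than a definitive solution: get_all, res, and results are illustrative names, and it assumes a reasonably recent tidyr where unnest() takes the column to unnest as its first argument after the data.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Adapted from the question's getAll: fit once, return both pieces in a list
get_all <- function(df) {
  fit <- lm(mpg ~ disp, data = df)
  list(
    pars = data.frame(intercept = coef(fit)[[1]], slope = coef(fit)[[2]]),
    prediction = data.frame(x = df$disp, y = unname(predict(fit)))
  )
}

results <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  mutate(res        = map(data, get_all),      # one list per group
         pars       = map(res, "pars"),        # extract list element by name
         prediction = map(res, "prediction")) %>%
  ungroup()

# Each unnest keeps cyl next to the extracted values
pars <- results %>% select(cyl, pars) %>% unnest(pars)
prediction <- results %>% select(cyl, prediction) %>% unnest(prediction)
```

Selecting only cyl and the column of interest before unnest() keeps the other list columns out of the result.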

Answered by danilinares on Sep 24 '22 at 02:09