
Using `broom::glance` in a dplyr workflow with a single lm object fails

Tags: r, dplyr, broom

When I use broom::glance in the following way:

library(dplyr)
library(broom)
mtcars %>% do(model = lm(mpg ~ wt, .)) %>% glance(model)

I get

Error in complete.cases(x) : invalid 'type' (list) of argument

However, when I add a group_by, the same pipeline:

mtcars %>% group_by(am) %>% do(model = lm(mpg ~ wt, .)) %>% glance(model)

does give the expected results:

Source: local data frame [2 x 12]
Groups: am

  am r.squared adj.r.squared sigma statistic  p.value df logLik  AIC  BIC deviance df.residual
1  0     0.589         0.565  2.53      24.4 1.25e-04  2  -43.5 93.1 95.9    108.7          17
2  1     0.826         0.810  2.69      52.3 1.69e-05  2  -30.2 66.4 68.1     79.3          11

Am I missing something here, or is this a bug in dplyr/broom?

Paul Hiemstra asked Aug 27 '15 13:08



1 Answer

This happens because do, when applied to an ungrouped table, returns a tbl_df rather than a rowwise_df, so broom ended up using a different glance method. I've fixed this in the latest development version of broom, so you can now do:

mtcars %>% do(model = lm(mpg ~ wt, .)) %>% glance(model)
#>   r.squared adj.r.squared    sigma statistic      p.value df    logLik
#> 1 0.7528328     0.7445939 3.045882  91.37533 1.293959e-10  2 -80.01471
#>        AIC      BIC deviance df.residual
#> 1 166.0294 170.4266 278.3219          30

I hope to have this up on CRAN (as broom 0.4) soon; until then, you can install the development version with devtools::install_github("dgrtwo/broom"). In the meantime, you could also use a temporary grouping column to get the desired behavior:

mtcars %>%
    group_by(g = 1) %>%
    do(model = lm(mpg ~ wt, .)) %>% 
    glance(model)
#> Source: local data frame [1 x 12]
#> Groups: g
#> 
#>   g r.squared adj.r.squared    sigma statistic      p.value df    logLik
#> 1 1 0.7528328     0.7445939 3.045882  91.37533 1.293959e-10  2 -80.01471
#> Variables not shown: AIC (dbl), BIC (dbl), deviance (dbl), df.residual
#>   (int)
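
To see the class difference described above for yourself, a minimal sketch is to inspect what do returns with and without grouping. The variable names ungrouped and grouped are only for illustration, and the exact class vectors printed will depend on the dplyr version you have installed.

```r
library(dplyr)

# do() on an ungrouped data frame: a single row holding the fitted model
ungrouped <- mtcars %>% do(model = lm(mpg ~ wt, .))

# do() on a grouped data frame: one row (and one model) per group
grouped <- mtcars %>% group_by(am) %>% do(model = lm(mpg ~ wt, .))

# Compare the classes; glance() picks its method based on these
class(ungrouped)
class(grouped)

# TRUE only when the result is a rowwise_df
inherits(ungrouped, "rowwise_df")
inherits(grouped, "rowwise_df")
```

If the two results differ in class, that explains why the same glance(model) call behaves differently in the two pipelines.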
David Robinson answered Oct 29 '22 23:10
