I'm new to the purrr paradigm and am struggling with it.
Following a few sources I have managed to get so far as to nest a data frame, run a linear model on the nested data, extract some coefficients from each lm, and generate a summary for each lm. The last thing I want to do is extract the "r.squared" from the summary (which I would have thought would be the simplest part of what I'm trying to achieve), but for whatever reason I can't get the syntax right.
Here's an MWE of what I have that works:
library(purrr)
library(dplyr)
library(tidyr)
mtcars %>%
  nest(-cyl) %>%
  mutate(fit = map(data, ~lm(mpg ~ wt, data = .)),
         sum = map(fit, ~summary))
and here's my attempt to extract the r.squared which fails:
mtcars %>%
  nest(-cyl) %>%
  mutate(fit = map(data, ~lm(mpg ~ wt, data = .)),
         sum = map(fit, ~summary),
         rsq = map_dbl(sum, "r.squared"))
Error in eval(substitute(expr), envir, enclos) : `x` must be a vector (not a closure)
This is superficially similar to the example given on the RStudio site:
mtcars %>%
  split(.$cyl) %>%
  map(~ lm(mpg ~ wt, data = .x)) %>%
  map(summary) %>%
  map_dbl("r.squared")
This works; however, I would like the r.squared values to sit in a new column (hence the mutate statement), and I'd like to understand why my code isn't working instead of working around the problem.
EDIT:
Here's a working solution that I came to using the solutions below:
mtcars %>%
  nest(-cyl) %>%
  mutate(fit = map(data, ~lm(mpg ~ wt, data = .)),
         summary = map(fit, glance),
         r_sq = map_dbl(summary, "r.squared"))
EDIT 2:
So, it actually turns out that the bug comes from the tilde in the sum = map(fit, ~summary) line. My guess is that the tilde makes each element the summary function itself, so it is the function that gets nested and not the object returned by calling summary. That would also explain the "not a closure" error. Would love an authoritative answer on this if someone wants to chime in.
To be clear, this version of the original code works fine:
mtcars %>%
  nest(-cyl) %>%
  mutate(fit = map(data, ~lm(mpg ~ wt, data = .)),
         summary = map(fit, summary),
         r_sq = map_dbl(summary, "r.squared"))
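For what it's worth, the guess in EDIT 2 can be checked directly. This is only a minimal sketch, assuming a reasonably recent purrr: in the formula shorthand, the right-hand side becomes the body of an anonymous function, so ~summary hands back the summary function itself for every element, while ~summary(.x) (or plain summary) actually calls it. The names fits, wrong, right and rsq are made up for illustration.

```r
library(purrr)

# One lm per cylinder group, built without nest() to keep the check small
fits <- map(split(mtcars, mtcars$cyl), ~ lm(mpg ~ wt, data = .x))

# Formula body is the bare name `summary`: each element is the function itself
wrong <- map(fits, ~ summary)

# Formula body calls summary() on the element: each element is a summary.lm
right <- map(fits, ~ summary(.x))

# Passing the function directly is equivalent to the line above
also_right <- map(fits, summary)

is.function(wrong[[1]])            # the closure that map_dbl() complains about
inherits(right[[1]], "summary.lm") # a real model summary

# Extraction by name only works on the real summaries
rsq <- map_dbl(right, "r.squared")
rsq
```

So inside mutate, either map(fit, summary) or map(fit, ~summary(.x)) should give a list-column that map_dbl can pull "r.squared" from.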
To fit in your current pipe, you'd want to use unnest along with map and glance from the broom package.
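library(tidyr)
library(dplyr)
library(purrr)  # provides map()
library(broom)  # provides glance()
mtcars %>%
  nest(-cyl) %>%
  # glance() is applied in mutate() first, because newer tidyr releases
  # expect unnest() to be given an existing column rather than an expression
  mutate(fit = map(data, ~lm(mpg ~ wt, data = .)),
         glanced = map(fit, glance)) %>%
  unnest(glanced)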
You'll get more than just the r-squared, and from there you can use select to drop what you don't need.
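As a rough sketch of that last step, assuming tidyr 1.0 or later (for the nest(data = -cyl) spelling) and assuming r.squared and adj.r.squared are the columns you care about; glanced and result are made-up names:

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

result <- mtcars %>%
  nest(data = -cyl) %>%                          # one row per cyl, data in a list-column
  mutate(fit = map(data, ~ lm(mpg ~ wt, data = .x)),
         glanced = map(fit, glance)) %>%         # one-row model summary per fit
  unnest(glanced) %>%                            # spread the glance() columns out
  select(cyl, r.squared, adj.r.squared)          # keep only what is needed

result
```

The select call also drops the data and fit list-columns, so keep them in the selection if you still need the models afterwards.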
If you want to keep the model summaries nested in list-columns:
mtcars %>%
  nest(-cyl) %>%
  mutate(fit = map(data, ~lm(mpg ~ wt, data = .)),
         summary = map(fit, glance))
If you just want to extract a single value from a nested frame, you only need to map over the list-column with the name of that value (and not [[ or extract2, as I originally suggested; many thanks for finding that out).
mtcars %>%
  nest(-cyl) %>%
  mutate(fit = map(data, ~lm(mpg ~ wt, data = .)),
         summary = map(fit, glance),
         r_sq = map_dbl(summary, "r.squared"))