I'm experimenting with dplyr, tidyr and purrr. I have data like this:
library(tidyverse)
set.seed(123)
# tibble() replaces the deprecated data_frame()
df <- tibble(X1 = rep(LETTERS[1:4], 6),
             X2 = sort(rep(1:6, 4)),
             ref = sample(1:50, 24),
             sampl1 = sample(1:50, 24),
             var2 = sample(1:50, 24),
             meas3 = sample(1:50, 24))
Now dplyr is awesome because I can do things like mutate_at() to manipulate multiple columns at once, e.g.:
df <- df %>%
  mutate_at(vars(-one_of(c("X1", "X2", "ref"))), funs(first = . - ref)) %>%
  mutate_at(vars(contains("first")), funs(second = . * 2))
and tidyr allows me to nest subsets of the data as sub-tables in a single column:
df <- df %>% nest(-X1)
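For readers on a recent tidyr, here is a minimal sketch of the same nesting step on a toy table. It assumes tidyr 1.0 or later, where the new list column is named explicitly (`data = -X1`); the `toy`, `nested` and `restored` names are made up for illustration and are not part of the question.

```r
library(tibble)
library(tidyr)

# Hypothetical toy data: two groups of three rows each
toy <- tibble(X1 = rep(c("A", "B"), each = 3), value = 1:6)

# Nest every column except X1 into a list column called `data`
nested <- nest(toy, data = -X1)

# unnest() reverses the operation and restores the flat table
restored <- unnest(nested, data)
```

Each element of `nested$data` is itself a tibble holding the rows for one value of `X1`, which is what lets purrr functions operate on one sub-table at a time.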
and thanks to purrr I can summarize these sub-tables while retaining the original data in the nested column:
df %>% mutate(mean = map_dbl(data, ~ mean(.x$meas3_first_second)))
How can I use purrr and mutate_at() to generate multiple summary columns, i.e. take the means of several (but not all) columns in each nested sub-table?
In this example I'd like to take the mean of every column with the word "second" in its name. I had hoped that the following would produce a new nested column which I could then unnest(), but it does not work:
df %>% mutate(mean = map(data, ~ mutate_at(vars(contains("second")),
funs(mean_comp_exp = mean(.)))))
How can I achieve this?
The comment by @aosmith was correct and helpful. In addition, I realised I needed to use summarise_at() and not mutate_at(), like so:
df %>%
  mutate(mean = map(data, ~ summarise_at(.x, vars(contains("second")),
                                         funs(mean_comp_exp = mean(.))))) %>%
  unnest(mean)
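Since funs() has been deprecated since dplyr 0.8.0 and the `_at` variants are superseded by across(), the same pipeline can probably be written as follows on current versions. This is only a sketch, assuming dplyr 1.0 or later and tidyr 1.0 or later; the `.names` templates are chosen to reproduce the column names from the code above, and `nested` and `result` are illustrative names.

```r
library(dplyr)
library(tidyr)
library(purrr)

set.seed(123)
df <- tibble(
  X1 = rep(LETTERS[1:4], 6),
  X2 = sort(rep(1:6, 4)),
  ref = sample(1:50, 24),
  sampl1 = sample(1:50, 24),
  var2 = sample(1:50, 24),
  meas3 = sample(1:50, 24)
)

nested <- df %>%
  # difference from ref for every column except X1, X2 and ref
  mutate(across(-c(X1, X2, ref), ~ .x - ref, .names = "{.col}_first")) %>%
  # double each of the new *_first columns
  mutate(across(contains("first"), ~ .x * 2, .names = "{.col}_second")) %>%
  # one row per X1, remaining columns packed into the `data` list column
  nest(data = -X1)

result <- nested %>%
  # summarise() returns one row per sub-table, so unnest() keeps one row per X1
  mutate(mean = map(data, ~ summarise(
    .x,
    across(contains("second"), mean, .names = "{.col}_mean_comp_exp")
  ))) %>%
  unnest(mean)
```

`result` should have one row per value of `X1`, with the original `data` column kept next to the three `*_first_second_mean_comp_exp` columns. Using mutate() instead of summarise() inside map() would keep every row of each sub-table, which is why a summarising verb is needed here.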