
Using mutate_at() on a nested dataframe column to generate multiple unnested columns

I'm experimenting with dplyr, tidyr and purrr. I have data like this:

library(tidyverse)

set.seed(123)
df <- data_frame(X1 = rep(LETTERS[1:4], 6),
                 X2 = sort(rep(1:6, 4)),
                 ref = sample(1:50, 24),
                 sampl1 = sample(1:50, 24),
                 var2 = sample(1:50, 24),
                 meas3 = sample(1:50, 24))

dplyr is great because verbs like mutate_at() let me manipulate multiple columns at once, e.g.:

df <- df %>% 
  mutate_at(vars(-one_of(c("X1", "X2", "ref"))), funs(first = . - ref)) %>% 
  mutate_at(vars(contains("first")),  funs(second = . *2 ))

and tidyr allows me to nest subsets of the data as sub-tables in a single column:

df <- df %>% nest(-X1) 

and thanks to purrr I can summarize these sub-tables while retaining the original data in the nested column:

df %>% mutate(mean = map_dbl(data, ~ mean(.x$meas3_first_second)))

How can I use purrr and mutate_at() to generate multiple summary columns, i.e. the means of several (but not all) columns in each nested sub-table?

In this example I'd like to take the mean of every column with the word "second" in it. I had hoped that this might produce a new nested column which I could then unnest(), but it does not work:

df %>% mutate(mean = map(data, ~ mutate_at(vars(contains("second")),
                                           funs(mean_comp_exp = mean(.)))))

How can I achieve this?

G_T asked Oct 29 '22 03:10



1 Answer

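The comment by @aosmith was correct and helpful. In addition, I realised I needed to use summarise_at() rather than mutate_at(), so that each nested table collapses to a single row of means. The nested table .x also has to be passed as the first argument, like so: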

df %>% 
    mutate(mean = map(data, ~ summarise_at(.x, vars(contains("second")),
                                               funs(mean_comp_exp = mean(.) )))) %>%
    unnest(mean)
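
For anyone reading this on a newer tidyverse: funs() is deprecated and the _at() verbs have been superseded by across(). Below is a minimal sketch of the same pipeline, assuming dplyr 1.0 or later and tidyr 1.0 or later. The column name `means` and the `.names` patterns are my own choices for illustration, not part of the original question.

```r
library(dplyr)
library(tidyr)
library(purrr)

set.seed(123)
df <- tibble(X1 = rep(LETTERS[1:4], 6),
             X2 = sort(rep(1:6, 4)),
             ref = sample(1:50, 24),
             sampl1 = sample(1:50, 24),
             var2 = sample(1:50, 24),
             meas3 = sample(1:50, 24))

nested <- df %>%
  # subtract ref from every measurement column -> *_first
  mutate(across(-c(X1, X2, ref), ~ .x - ref, .names = "{.col}_first")) %>%
  # double each *_first column -> *_first_second
  mutate(across(contains("first"), ~ .x * 2, .names = "{.col}_second")) %>%
  # one row per X1, remaining columns stored in the list-column `data`
  nest(data = -X1)

result <- nested %>%
  # summarise() returns a one-row tibble per nested table
  mutate(means = map(data, ~ summarise(.x,
                                       across(contains("second"), mean,
                                              .names = "{.col}_mean_comp_exp")))) %>%
  # spread the one-row tibbles out into ordinary columns
  unnest(means)
```

The result keeps the nested `data` column and gains one `*_mean_comp_exp` column per "second" column, with one row per value of X1.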
G_T answered Nov 15 '22 05:11
