I have a data frame with a bunch of nested data frames within it, and I'd like to apply dplyr::select to each of those nested data frames. Here's an example:
library(tidyverse)
mtcars %>%
  group_by(cyl) %>%
  nest %>%
  mutate(data2 = ~map(data, dplyr::select(.,-mpg)))
I would think that this would result in a data frame with three columns: cyl (the number of cylinders), data (the nested data), and data2 (the same as data, except that each element would not have the mpg column).
Instead R crashes:
*** caught segfault ***
address 0x7ffc1e445000, cause 'memory not mapped'

Traceback:
 1: .Call(`_dplyr_mutate_impl`, df, dots)
 2: mutate_impl(.data, dots)
 3: mutate.tbl_df(., data2 = ~map(data, dplyr::select(., -mpg)))
 4: mutate(., data2 = ~map(data, dplyr::select(., -mpg)))
 5: function_list[[k]](value)
 6: withVisible(function_list[[k]](value))
 7: freduce(value, `_function_list`)
 8: `_fseq`(`_lhs`)
 9: eval(quote(`_fseq`(`_lhs`)), env, env)
10: eval(quote(`_fseq`(`_lhs`)), env, env)
11: withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
12: mtcars %>% group_by(cyl) %>% nest %>% mutate(data2 = ~map(data, dplyr::select(., -mpg)))

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
I realize I could get the columns I want by applying the select operation before the nesting, but that would be less analogous to my real problem. Could somebody please explain what I am doing wrong here? Thanks for any advice.
The select() function from the dplyr package selects variables (columns) from a data frame. You can refer to the columns by name or by position (index), and a leading minus sign, as in -mpg, drops a column instead of keeping it.
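As a small illustration of that description, here is a sketch using the built-in mtcars data. The object names (by_name, without_mpg, by_position) are made up for this example.

```r
library(dplyr)

# Keep columns by name.
by_name <- select(mtcars, mpg, cyl)

# Drop a column with a leading minus sign.
without_mpg <- select(mtcars, -mpg)

# Keep columns by position.
by_position <- select(mtcars, 1:3)

names(by_name)      # "mpg" "cyl"
ncol(without_mpg)   # 10 (mtcars has 11 columns)
names(by_position)  # "mpg" "cyl" "disp"
```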
The map functions transform their input by applying a function to each element of a list or atomic vector and returning an object of the same length as the input. map() always returns a list. See the modify() family for versions that return an object of the same type as the input.
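To make the difference between map() and modify() concrete, here is a minimal sketch on a small named numeric vector. The vector x and the result names are hypothetical and only serve the illustration.

```r
library(purrr)

x <- c(a = 1, b = 2, c = 3)

# map() always returns a list with one element per input element.
doubled_list <- map(x, ~ .x * 2)

# modify() returns an object of the same type as the input,
# here a double vector.
doubled_dbl <- modify(x, ~ .x * 2)

is.list(doubled_list)              # TRUE
length(doubled_list) == length(x)  # TRUE
is.double(doubled_dbl)             # TRUE
```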
map_df() essentially applies the function to each element and then calls bind_rows() on the results, so it returns a single data frame. With its .id argument it also adds a new variable (dist in the example this describes) that holds the names of the list elements, which produces a long data frame. In that example the result is passed to ggplot(), which draws histograms with geom_histogram() and facets them into six panes with facet_wrap().
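The row-binding behaviour can be sketched as follows. The list samples and its element names are invented for this example, and naming the identifier column dist through .id is an assumption about how the description above was produced. The plotting step is left out.

```r
library(purrr)
library(dplyr)

set.seed(1)

# A hypothetical named list of numeric vectors.
samples <- list(normal = rnorm(5), uniform = runif(5))

# Each element becomes a small tibble; map_df() row-binds them and
# stores the element names in the column given by .id.
long <- map_df(samples, ~ tibble(value = .x), .id = "dist")

names(long)        # "dist" "value"
nrow(long)         # 10
unique(long$dist)  # "normal" "uniform"
```

In recent purrr releases map_df() is marked as superseded, but it is still available.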
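You need to move the ~ from in front of map to in front of select (or do as @Russ suggests in the comments). In your version the whole map() call is wrapped in the formula, so mutate() receives an unevaluated formula rather than a list column. The ~ is used when the function (in this case purrr::map) accepts a formula as an argument: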
mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  mutate(data2 = map(data, ~ select(., -mpg)))
# A tibble: 3 x 3
#     cyl data               data2
#   <dbl> <list>             <list>
# 1     6 <tibble [7 × 10]>  <tibble [7 × 9]>
# 2     4 <tibble [11 × 10]> <tibble [11 × 9]>
# 3     8 <tibble [14 × 10]> <tibble [14 × 9]>
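To see why the position of the ~ matters, the sketch below shows that ~ builds a formula object, that purrr can turn such a formula into a function, and that an ordinary anonymous function gives the same columns. The names drop_mpg, nested, via_formula and via_function are invented for this illustration.

```r
library(dplyr)
library(tidyr)
library(purrr)

# A one-sided formula is just an object until something converts it.
f <- ~ select(., -mpg)
class(f)  # "formula"

# purrr converts the formula into a function before calling it;
# as_mapper() performs that conversion explicitly.
drop_mpg <- as_mapper(f)
ncol(drop_mpg(mtcars))  # 10

nested <- mtcars %>%
  group_by(cyl) %>%
  nest()

result <- nested %>%
  mutate(
    via_formula  = map(data, ~ select(.x, -mpg)),             # formula shorthand
    via_function = map(data, function(df) select(df, -mpg))   # plain function
  )

# Neither list column contains mpg any more.
map_lgl(result$via_formula, ~ "mpg" %in% names(.x))
```

Writing the anonymous function out in full is a reasonable choice when the formula shorthand feels too implicit.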