
Clarity on purrr syntax

Tags:

r

purrr

I often find myself choosing the wrong way to refer to variables when using purrr.

For example, take the code on the GitHub page of purrr.

library(purrr)

mtcars %>%
  split(.$cyl)

In split(.$cyl) I often make the mistake of writing split(cyl). That seems like the most obvious choice, as it is consistent with other tidyverse commands such as select(cyl).

My question is: why is the .$ needed in front of the variable name?

Alex asked Mar 06 '18 12:03



1 Answer

The . represents the data object being piped in (here mtcars), and $ extracts the column from it. It also works with [[:

mtcars %>%
    split(.[['cyl']])

Within mutate/summarise/group_by/select/arrange etc. we can simply pass bare column names, but split is different: it is a base R function and evaluates its arguments as ordinary R expressions, so it cannot find 'cyl' inside the dataset unless we extract the column ourselves.
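As a minimal sketch of the difference, using base R only: because the dot appears only inside a nested call, mtcars %>% split(.$cyl) amounts to split(mtcars, mtcars$cyl), which is what the first line below spells out. The names by_cyl, bare and by_cyl_with are only illustrative.

```r
# Written without the pipe: mtcars %>% split(.$cyl) amounts to this call.
by_cyl <- split(mtcars, mtcars$cyl)
names(by_cyl)   # one data frame per distinct value of cyl

# A bare column name is evaluated like any other R variable, so this fails
# unless an object called cyl happens to exist in the calling environment.
bare <- tryCatch(split(mtcars, cyl), error = function(e) e)
inherits(bare, "error")

# with() evaluates its expression inside the data frame, so cyl is found.
by_cyl_with <- with(mtcars, split(mtcars, cyl))
identical(by_cyl, by_cyl_with)
```

The with() form is just another base R way of making the column visible; it is not something split does on its own.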

One tidyverse option is to nest all variables except 'cyl', i.e.

mtcars %>%
    nest(-cyl) 

Now we have a list column named 'data', which holds all the other columns as a list of data.frames, one per value of 'cyl'.


With newer versions of dplyr (tested with 0.8.1), there is also group_split, as pointed out in a comment by @Moody_Mudskipper. Being a dplyr function, it accepts the bare column name:

mtcars %>%
    group_split(cyl)
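A short sketch of what group_split returns, assuming a dplyr version that provides it (0.8.0 or later) is installed. The name parts and the row-count step are only illustrative.

```r
library(dplyr)

# Bare column name works here because group_split() is a dplyr verb.
parts <- mtcars %>% group_split(cyl)

length(parts)          # one tibble per distinct cyl value
is.null(names(parts))  # unlike split(), the list elements are not named

# A hypothetical follow-up step: count the rows in each piece.
vapply(parts, nrow, integer(1))
```

Since the pieces come back unnamed, split(.$cyl) may still be the more convenient choice when the group labels are needed as list names.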
akrun answered Oct 18 '22 01:10
