
Non-standard Evaluation using tidyr::expand

Tags:

r

nse

tidyr

I am having trouble using non-standard evaluation (NSE) expressions with the tidyr package.

I want to expand two columns, which may or may not be identical, into a data frame with all possible combinations of their values. This will happen inside a function, so I will not know the column names in advance.

Here is a minimal example:

library(tidyr)
library(dplyr) # mutate(), select(), vars(), one_of(); also re-exports sym(), quo(), enquos()

dummy <- data.frame(x = c("ex1", "ex2"), y = c('cat1', 'cat2')) # dataset

tidyr::expand(dummy, x, y) # using bare column names works
tidyr::expand_(dummy, c("x", "y"))  # using the deprecated syntax works

# The following did not work:

  tidyr::expand(dummy, one_of('x'), y) # using select syntax
  tidyr::expand(dummy, vars('x', 'y')) # mutate_at style
  tidyr::expand(dummy, .data[[cnae_agg]], .data[[cnae_agg]])  # mutate current style  
  tidyr::expand(dummy, sym('x'), sym('y')) # trying to convert to symbols
  tidyr::expand(dummy, !!!enquos('x', 'y')) 
  tidyr::expand(dummy, !!('x'), y) # unquoting just one element
  tidyr::expand(dummy, !!!c("x", "y")) # unquote-splicing a vector of strings
  tidyr::expand(dummy, !!!c(quo("x"), quo("y"))) # unquote-splicing strings wrapped in quosures first

So, I have two questions:

1) What is the correct syntax to be applied with the tidyr expand function?

2) I probably read the Advanced R chapter on quasiquotation several times already, but it is still not clear to me why there are several different 'styles' to use nse with the tidyverse, and where exactly to use each.

I can throw pretty much anything at select/summarise and it will work, but mutate reacts differently.

For example:

  # mutate
  mutate(dummy, new_var = .data[['x']]) # mutate basic style
  mutate(dummy, new_var = !!'x') # this just assigns the string 'x' to all rows


  # mutate at
  mutate_at(dummy, .vars=vars('y'), list(~'a')) # this works
  mutate_at(dummy, .vars=vars(!!'y'), list(~'a')) # this also works
  mutate_at(dummy, .vars=vars('y'), list(~`<-`(.,!!'x'))) # using unquote to build an assignment does not work
  mutate_at(dummy, .vars=vars('y'), list(~`<-`(.,vars(!!'x')))) # even vars(), which works for variable selection, does not suffice

  # select 
  select(dummy, x) # this works
  select(dummy, 'x') # this works
  select_at(dummy, vars(!!'x')) # this works
  select_at(dummy, 'x') # this works
  select_at(dummy, !!'x') # this doesn't work

Which brings me to my second question.

Is there an updated guide with all the current syntaxes for the tidyverse style focusing on the differences in usage for each 'verb', such as in 'mutate' vs 'select' (i.e. when one works and the other doesn't)?

And how do I know whether to use the mutate style or the select style of NSE in other tidyverse packages, such as tidyr?

Elijah asked Sep 23 '19 20:09



1 Answer

We need to convert the strings to symbols with syms() and then unquote-splice (!!!) them into the call

tidyr::expand(dummy,  !!! syms(c('x', 'y')))
# A tibble: 4 x 2
#  x     y    
#  <fct> <fct>
#1 ex1   cat1 
#2 ex1   cat2 
#3 ex2   cat1 
#4 ex2   cat2 

This is particularly useful when the column names are stored in a character vector and we want to expand over them

nm1 <- c('x', 'y')
tidyr::expand(dummy, !!! syms(nm1))

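In the other attempts, either the !!! is missing or the character vector is never converted to symbols. expand() evaluates its arguments as expressions on the data, so an unquoted string such as !!'x' is inlined as the constant "x" rather than treated as a column name.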
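
Since the question mentions doing this inside a function, the same idea can be wrapped up as a minimal sketch like the one below. The helper name expand_cols and its arguments are hypothetical (they are not part of tidyr); it assumes the column names arrive as a character vector and that each name appears only once.

```r
library(tidyr)
library(rlang)

# Hypothetical helper, not part of tidyr: expand `data` over the columns
# named in the character vector `cols`.
expand_cols <- function(data, cols) {
  # syms() turns each string into a symbol; !!! splices the symbols into
  # the call as if they had been typed as bare column names.
  tidyr::expand(data, !!!rlang::syms(cols))
}

dummy <- data.frame(x = c("ex1", "ex2"), y = c("cat1", "cat2"))
res <- expand_cols(dummy, c("x", "y"))
print(res)
```

Calling expand_cols(dummy, "x") would expand over a single column in the same way, because syms() also accepts a length-one character vector.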

akrun answered Nov 15 '22 07:11