
Why is using list() critical for .dots = setNames() uses in dplyr?

Tags: r, dplyr

I am calling mutate using dynamic variable names. An example that mostly works is:

library(dplyr)
library(lazyeval)  # provides interp()

df <- data.frame(a = 1:5, b = 1:5)
func <- function(a,b){
  return(a+b)
}
var1 = 'a'
var2 = 'b'
expr <- interp(~func(x, y), x = as.name(var1), y = as.name(var2))
new_name <- "dynamically_created_name"
temp <- df %>% mutate_(.dots = setNames(expr, nm = new_name))

Which produces

temp
  a b func(a, b)
1 1 1          2
2 2 2          4
3 3 3          6
4 4 4          8
5 5 5         10

This is mostly fine, except that the name passed to setNames via nm is ignored: the new column is called func(a, b) instead of dynamically_created_name. This is solved by wrapping the expression in list():

temp <- df %>% mutate_(.dots = setNames(list(expr), nm = new_name))
temp
  a b dynamically_created_name
1 1 1                        2
2 2 2                        4
3 3 3                        6
4 4 4                        8
5 5 5                       10

My question is: why is the name I pass to setNames ignored in the first place, and how does list() solve this problem?

Daniel Randles asked Mar 17 '16 17:03


1 Answer

As noted in the other answer, the .dots argument is assumed to be a list, and setNames is a convenient way to rename elements in a list.
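
To make the list requirement concrete, here is a minimal base R sketch (no dplyr or lazyeval needed). It assumes that the result of interp() is an ordinary one-sided formula, so a hand-written formula stands in for it. A formula is a call object rather than a list, so setNames() on the bare formula labels the pieces of that call instead of naming a list element, which appears to be why mutate_ falls back to the deparsed expression as the column name.

```r
# Stand-in for the formula that interp() returns in the question
expr <- ~ func(a, b)

# A one-sided formula is a call with two components: `~` and func(a, b)
is.list(expr)
length(expr)

# setNames() on the bare formula still returns a formula, not a list;
# the name is attached to a component of the call itself
named_formula <- setNames(expr, "dynamically_created_name")
is.list(named_formula)
names(named_formula)

# Wrapping in list() first gives a one-element list,
# and setNames() then names that element
named_list <- setNames(list(expr), "dynamically_created_name")
is.list(named_list)
names(named_list)
```

Only the second form has the shape that .dots expects: a list whose element names become the new column names.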

What is the .dots argument doing? Let's first think about the actual dots ... argument. It is a series of expressions to be evaluated. In the call below, the dots ... are the two named expressions c = ~ a * scale1 and d = ~ a * scale2.

scale1 <- -1
scale2 <- -2

df %>% 
  mutate_(c = ~ a * scale1, d = ~ a * scale2)
#>   a b  c   d
#> 1 1 1 -1  -2
#> 2 2 2 -2  -4
#> 3 3 3 -3  -6
#> 4 4 4 -4  -8
#> 5 5 5 -5 -10

We could just bundle those expressions together beforehand in a list. That's where .dots comes in. That parameter lets us tell mutate_ to evaluate the expressions in the list.

bundled <- list(
  c2 = ~ a * scale1, 
  d2 = ~ a * scale2
)

df %>% 
  mutate_(.dots = bundled)
#>   a b c2  d2
#> 1 1 1 -1  -2
#> 2 2 2 -2  -4
#> 3 3 3 -3  -6
#> 4 4 4 -4  -8
#> 5 5 5 -5 -10

If we want to programmatically update the names of the expressions in the list, then setNames is a convenient way to do that. If we want to programmatically mix and match constants and variable names when making expressions, then the lazyeval package provides convenient ways to do that. Below, I do both: I create a list of expressions, name them, and evaluate them with mutate_.

# Imagine some dropdown boxes in a Shiny app, and this is what user requested
selected_func1 <- "min"
selected_func2 <- "max"
selected_var1 <- "a"
selected_var2 <- "b"

# Assemble expressions from those choices
bundled2 <- list(
  interp(~fun(x), fun = as.name(selected_func1), x = as.name(selected_var1)),
  interp(~fun(x), fun = as.name(selected_func2), x = as.name(selected_var2))
)
bundled2
#> [[1]]
#> ~min(a)
#> 
#> [[2]]
#> ~max(b)

# Create variable names
exp_name1 <- paste0(selected_func1, "_", selected_var1)
exp_name2 <- paste0(selected_func2, "_", selected_var2)

bundled2 <- setNames(bundled2, c(exp_name1, exp_name2))
bundled2
#> $min_a
#> ~min(a)
#> 
#> $max_b
#> ~max(b)

# Evaluate the expressions
df %>% 
  mutate_(.dots = bundled2)
#>   a b min_a max_b
#> 1 1 1     1     5
#> 2 2 2     1     5
#> 3 3 3     1     5
#> 4 4 4     1     5
#> 5 5 5     1     5
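
The same pattern can be extended to any number of choices. The sketch below uses a hypothetical helper, build_dots (not part of dplyr or lazyeval), and swaps interp() for plain string pasting with base R's as.formula(), so it should only be fed trusted function and column names. It returns a named list of formulas with the same shape as bundled2 above.

```r
# Hypothetical helper: build a named list of one-sided formulas,
# one per (function name, column name) pair. Base R only.
build_dots <- function(funs, vars, env = parent.frame()) {
  stopifnot(length(funs) == length(vars))
  exprs <- Map(
    # Parse a string such as "~min(a)" into a formula
    function(f, v) as.formula(paste0("~", f, "(", v, ")"), env = env),
    funs, vars
  )
  # Name each element <function>_<column>, e.g. min_a
  setNames(exprs, paste0(funs, "_", vars))
}

dots <- build_dots(c("min", "max"), c("a", "b"))
names(dots)
```

Because the result is a named list, it can be passed as mutate_(.dots = dots) in the same way as bundled2.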

TJ Mahr answered Nov 04 '22 16:11