
Lookup function for mutate in data

Tags: r, dplyr

I'd like to store functions, or at least their names, in a column of a data.frame for use in a call to mutate. A simplified broken example:

library(dplyr)
expand.grid(mu = 1:5, sd = c(1, 10), stat = c('mean', 'sd')) %>%
  group_by(mu, sd, stat) %>%
  mutate(sample = get(stat)(rnorm(100, mu, sd))) %>%
  ungroup()

If this worked how I thought it would, the value of sample would be generated by the function in .GlobalEnv corresponding to either 'mean' or 'sd', depending on the row.

The error I get is:

 Error in mutate_impl(.data, dots) : 
    Evaluation error: invalid first argument. 

Surely this has to do with non-standard evaluation ... grrr.

Asked by Ian, Feb 27 '26 04:02


1 Answer

A few issues here. First, expand.grid() converts character values to factors by default, and get() does not work with factors (i.e. get(factor("mean")) gives an error). The tidyverse-friendly alternative is tidyr::crossing(), which keeps strings as character. (You could also pass stringsAsFactors = FALSE to expand.grid().)
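To see the factor issue in isolation, here is a minimal base R sketch. The object names g1, g2 and res are only illustrative, and tryCatch() is used so the failing call does not stop the script.

```r
# expand.grid() converts character vectors to factors by default
g1 <- expand.grid(mu = 1:2, stat = c("mean", "sd"))
class(g1$stat)

# with stringsAsFactors = FALSE the column stays character
g2 <- expand.grid(mu = 1:2, stat = c("mean", "sd"), stringsAsFactors = FALSE)
class(g2$stat)

# get() expects a character string (or symbol); a factor raises an error,
# which is captured here as a message instead of aborting
res <- tryCatch(get(factor("mean")), error = function(e) conditionMessage(e))
res
```

The captured message should match the "invalid first argument" text from the question, which suggests the factor column alone is enough to trigger that error.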

Second, mutate() assumes that the functions you call are vectorized, but get() is not: it has to be called one value at a time. Rather than relying on group_by() to produce one-row groups, a safer way to guarantee one-at-a-time evaluation is rowwise().
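What "one at a time" means can be sketched without dplyr at all. The loop below is a base R illustration of the same idea, not what rowwise() does internally. The names grid and i are assumptions, and mode = "function" is added so that get() only accepts functions.

```r
set.seed(1)  # only to make the illustration reproducible

grid <- expand.grid(mu = 1:2, sd = c(1, 10), stat = c("mean", "sd"),
                    stringsAsFactors = FALSE)

# visit each row separately: look up one function name, call it on one sample
grid$sample <- vapply(seq_len(nrow(grid)), function(i) {
  f <- get(grid$stat[i], mode = "function")  # a single name per call
  f(rnorm(100, grid$mu[i], grid$sd[i]))
}, numeric(1))

grid
```

Each call to get() here receives exactly one string, which is the guarantee rowwise() is meant to provide inside mutate().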

And finally, the real problem is that you call get("sd") while sd is also a column in the data.frame being mutated. get() therefore finds that column first, and it is just a number, not a function. You need to tell get() to start its lookup from the global environment explicitly. Try

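library(dplyr)
tidyr::crossing(mu = 1:5, sd = c(1, 10), stat = c('mean', 'sd')) %>%
  rowwise() %>%
  # start the lookup at the global environment so the sd column is skipped
  mutate(sample = get(stat, envir = globalenv())(rnorm(100, mu, sd))) %>%
  ungroup()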
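The name clash can be reproduced in plain R with a function whose argument is called sd, which plays the role of the data column. This is a hedged sketch: sd_lookup is a hypothetical helper, and the mode = "function" variant is an alternative not mentioned above, shown here only in base R rather than inside mutate().

```r
# hypothetical helper: the argument `sd` shadows stats::sd inside the function
sd_lookup <- function(stat, sd) {
  list(
    masked      = get(stat),                       # finds the local numeric `sd`
    from_global = get(stat, envir = globalenv()),  # skips the local frame
    by_mode     = get(stat, mode = "function")     # ignores non-function matches
  )
}

out <- sd_lookup("sd", 10)
is.function(out$masked)
is.function(out$from_global)
is.function(out$by_mode)
```

Note that envir = globalenv() still assumes nothing named sd has been defined in the global environment itself; if that is a concern, restricting the lookup with mode = "function" may be worth testing in your own pipeline.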
Answered by MrFlick, Mar 01 '26 20:03


