
dplyr mutate using dynamic variable name while respecting group_by

Tags: r, dplyr

Following the answers to "dplyr mutate using variable columns" and "dplyr - mutate: use dynamic variable names", I'm trying to use dynamic column names in mutate(). My goal is to normalize column data by group, subject to a minimum standard deviation. Each column has a different minimum standard deviation.

For example (I omitted the loops and map statements for brevity):

library(dplyr)
library(purrr)  # provides pluck()
data(iris)
iris <- as_tibble(iris)

minsd <- c('Sepal.Length' = 0.8)
varname <- 'Sepal.Length'

# Problem: pluck(iris, varname) reads the whole, ungrouped column
iris %>%
  group_by(Species) %>%
  mutate(!!varname := mean(pluck(iris, varname), na.rm = TRUE) /
           max(sd(pluck(iris, varname)), minsd[varname]))

I got the dynamic assignment and variable selection to work as suggested in the referenced answers, but group_by() is not respected, which, for me at least, is the main benefit of using dplyr here.

The desired result is given by:

iris %>%
  group_by(Species) %>%
  mutate(!!varname := mean(Sepal.Length, na.rm = TRUE) /
           max(sd(Sepal.Length), minsd[varname]))

Is there a way around this?

hjw asked Apr 19 '18 05:04


2 Answers

I don't know much about pluck, so I can't say what went wrong there, but I would write it like this, which works:

iris %>% 
  group_by(Species) %>% 
  mutate(
    !! varname :=
      mean(!!as.name(varname), na.rm = TRUE) / 
      max(sd(!!as.name(varname)),
          minsd[varname])
  )

Let me know if this isn't what you were looking for.

Kim answered Oct 24 '22 00:10


The other answer is the better approach, and it also solved a similar problem that I had encountered. For example, with !!as.name() there is no need to use group_by_() (or group_by_at()) or arrange_() (or arrange_at()).

However, another way is to replace pluck(iris, varname) in your code with .data[[varname]]. pluck(iris, varname) does not work because iris there refers to the original, ungrouped data frame, so the whole column is used regardless of group_by(). In contrast, .data refers to the data that mutate() is currently operating on, so it respects the grouping.
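Here is a minimal sketch of the .data approach, reusing minsd and varname from the question. The names iris_tbl and result are only illustrative, and I use minsd[[varname]] rather than minsd[varname] purely to drop the element name:

```r
library(dplyr)

iris_tbl <- as_tibble(iris)
minsd <- c(Sepal.Length = 0.8)
varname <- "Sepal.Length"

# .data[[varname]] looks the column up inside the current group,
# whereas pluck(iris, varname) always reads the full, ungrouped data frame.
result <- iris_tbl %>%
  group_by(Species) %>%
  mutate(
    !!varname := mean(.data[[varname]], na.rm = TRUE) /
      max(sd(.data[[varname]]), minsd[[varname]])
  ) %>%
  ungroup()

# One distinct value per species indicates that the grouping was respected.
distinct(result, Species, Sepal.Length)
```

If the grouping were ignored, the last call would return the same Sepal.Length value for all three species.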

An alternative to as.name() is rlang::sym() from the rlang package.
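A sketch of the rlang::sym() variant, assuming the same minsd and varname as in the question (col, iris_tbl and result_sym are illustrative names). It should behave the same as the !!as.name() version:

```r
library(dplyr)

iris_tbl <- as_tibble(iris)
minsd <- c(Sepal.Length = 0.8)
varname <- "Sepal.Length"

# Convert the column name from a string to a symbol once, then unquote it.
col <- rlang::sym(varname)

result_sym <- iris_tbl %>%
  group_by(Species) %>%
  mutate(
    !!varname := mean(!!col, na.rm = TRUE) /
      max(sd(!!col), minsd[[varname]])
  ) %>%
  ungroup()

# Each species should end up with its own value.
distinct(result_sym, Species, Sepal.Length)
```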

L. Francis Cong answered Oct 24 '22 02:10