Paste variable name in mutate (dplyr)

Tags: r, dplyr

I am trying to create a variable whose name is built with paste() inside a mutate_() call (dplyr).

I tried to adapt the code from this answer (dplyr - mutate: use dynamic variable names), but it doesn't work.

NB: nameVarPeriod1 is a parameter of a function.

nameVarPeriod1 <- "A2"
df <- df %>%
    group_by(segment) %>%
    mutate_((.dots=setNames(mean(paste0("Sum",nameVarPeriod1)), paste0("MeanSum",nameVarPeriod1))))

This returns a warning:

Warning message:
In mean.default(paste0("Sum", nameVarPeriod1)) :
  argument is not numeric or logical: returning NA

How can I evaluate the string built by paste0 as a variable name?

When I replace the paste0 call with the literal column name, it works fine:

df <- df %>%
    group_by(segment) %>%
    mutate(mean=mean(SumA2))

Data:

structure(list(segment = structure(c(5L, 1L, 4L, 2L, 2L, 14L, 
11L, 6L, 14L, 1L), .Label = c("Seg1", "Seg2", "Seg3", "Seg4", 
"Seg5", "Seg6", "Seg7", "Seg8", "Seg9", "Seg10", "Seg11", "Seg12", 
"Seg13", "Seg14"), class = "factor"), SumA2 = c(107584.9, 127343.87, 
205809.54, 138453.4, 24603.46, 44444.39, 103672, 88695.8, 64400, 
36815.82)), .Names = c("segment", "SumA2"), row.names = c(NA, 
-10L), class = c("tbl_df", "tbl", "data.frame"))
Asked by Cox Tox, Jun 01 '18 09:06

2 Answers

From dplyr 0.7.0 onwards, mutate_() is no longer needed. Here is a solution that uses := to assign the variable name dynamically, together with the helper functions quo_name() and as.name().

It will be helpful to read vignette("programming", "dplyr") for more info. See also Use dynamic variable names in `dplyr` for older versions of dplyr.

df <- df %>%
  group_by(segment) %>%
  mutate(
    !!paste0('MeanSum', quo_name(nameVarPeriod1)) :=
      mean(!!as.name(paste0('Sum', quo_name(nameVarPeriod1))))
  )
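The original attempt warns because paste0("Sum", nameVarPeriod1) is only the character string "SumA2", so mean() receives text rather than the column. Below is a minimal, self-contained sketch of the approach above wrapped in a function. The helper name add_group_mean and the small df are made up for illustration, and it assumes dplyr 0.7.0 or later.

```r
library(dplyr)

# Small illustrative data frame (hypothetical; shaped like the question's data)
df <- data.frame(
  segment = c("Seg1", "Seg2", "Seg2", "Seg1"),
  SumA2   = c(10, 20, 40, 30)
)

# Hypothetical helper: builds both column names from the suffix string
add_group_mean <- function(data, nameVarPeriod1) {
  src <- paste0("Sum", nameVarPeriod1)       # existing column, e.g. "SumA2"
  dst <- paste0("MeanSum", nameVarPeriod1)   # new column, e.g. "MeanSumA2"
  data %>%
    group_by(segment) %>%
    # !!dst := sets the new name; as.name() turns the string into a symbol
    mutate(!!dst := mean(!!as.name(src))) %>%
    ungroup()
}

res <- add_group_mean(df, "A2")
```

Here res gains a MeanSumA2 column holding the per-segment mean of SumA2.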

dplyr 1.0.0 alternative:

With across(), introduced in dplyr 1.0.0, we can set the output names using glue-style syntax and include the function name and the original column name as part of the new name:

my_fn <- function(nameVarPeriod1 = 'A2'){
  col_list <- paste0('Sum', nameVarPeriod1)
  df %>% 
    group_by(segment) %>%
    # all_of() makes it explicit that col_list holds column names
    mutate(across(all_of(col_list), list(mean = mean), .names = "{fn}{col}"))
}

my_fn()
#   segment   SumA2 meanSumA2
#   <fct>     <dbl>     <dbl>
# 1 Seg5    107585.   107585.
# 2 Seg1    127344.    82080.
# 3 Seg4    205810.   205810.
# 4 Seg2    138453.    81528.
# 5 Seg2     24603.    81528.
# 6 Seg14    44444.    54422.
# 7 Seg11   103672    103672 
# 8 Seg6     88696.    88696.
# 9 Seg14    64400     54422.
#10 Seg1     36816.    82080.
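For a single column, the .data pronoun combined with a glue-style name on the left of := is another option. This is only a sketch, assuming a recent dplyr and rlang (glue-style names are not available in old releases). The names suffix and src and the small df are illustrative.

```r
library(dplyr)

# Small illustrative data frame (hypothetical; shaped like the question's data)
df <- data.frame(
  segment = c("Seg1", "Seg2", "Seg2", "Seg1"),
  SumA2   = c(10, 20, 40, 30)
)

suffix <- "A2"                   # plays the role of nameVarPeriod1
src    <- paste0("Sum", suffix)  # name of the existing column

res <- df %>%
  group_by(segment) %>%
  # "{suffix}" is interpolated into the new column name;
  # .data[[src]] looks the column up by its string name
  mutate("MeanSum{suffix}" := mean(.data[[src]])) %>%
  ungroup()
```

This keeps the exact MeanSumA2 name requested in the question, with no unquoting operators needed.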
Answered by Chris, Nov 16 '22 01:11


I'm not sure why you want to name the summarized column after the original column. But if you want to summarize several columns and have the new columns named after them, dplyr::mutate_at does that for you.

library(dplyr)
df %>% group_by(segment) %>%
  mutate(SumA3 = SumA2) %>%     # Added another column to demonstrate
  # funs() is deprecated since dplyr 0.8.0; a named list gives the same column names
  mutate_at(vars(starts_with("SumA")), list(mean = mean))

#  segment  SumA2  SumA3 SumA2_mean SumA3_mean
# <fctr>   <dbl>  <dbl>      <dbl>      <dbl>
# 1 Seg5    107585 107585     107585     107585
# 2 Seg1    127344 127344      82080      82080
# 3 Seg4    205810 205810     205810     205810
# 4 Seg2    138453 138453      81528      81528
# 5 Seg2     24603  24603      81528      81528
# 6 Seg14    44444  44444      54422      54422
# 7 Seg11   103672 103672     103672     103672
# 8 Seg6     88696  88696      88696      88696
# 9 Seg14    64400  64400      54422      54422
# 10 Seg1     36816  36816      82080      82080
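In dplyr 1.0.0 and later, the same multi-column result can probably be written with across() rather than mutate_at(). A hedged sketch follows. The small df is made up for illustration, and the "{.col}_mean" pattern is chosen to mimic the column names shown above.

```r
library(dplyr)

# Small illustrative data frame (hypothetical) with two "SumA" columns
df <- data.frame(
  segment = c("Seg1", "Seg2", "Seg2", "Seg1"),
  SumA2   = c(10, 20, 40, 30),
  SumA3   = c(1, 2, 4, 3)
)

res <- df %>%
  group_by(segment) %>%
  # apply mean() to every column whose name starts with "SumA";
  # {.col} is replaced by the source column name in each new name
  mutate(across(starts_with("SumA"), mean, .names = "{.col}_mean")) %>%
  ungroup()
```

The result contains SumA2_mean and SumA3_mean, each holding the per-segment mean of its source column.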
Answered by MKR, Nov 16 '22 01:11