So I'm trying to do some programming in dplyr and I am having some trouble with the enquo and !! evaluations.
Basically I would like to mutate a column to a dynamic column name, and then be able to further manipulate that column (i.e. summarize). For instance:
my_function <- function(data, column) {
quo_column <- enquo(column)
new_col <- paste0(quo_column, "_adjusted")[2]
data %>%
mutate(!!new_col := (!!quo_column) + 1)
}
my_function(iris, Petal.Length)
This works great and returns a column called "Petal.Length.adjusted" which is just Petal.Length increased by one.
However I can't seem to summarize this new column.
my_function <- function(data, column) {
quo_column <- enquo(column)
new_col <- paste0(quo_column, "_adjusted")[2]
mean_col <- paste0(quo_column, "_meanAdjusted")[2]
data %>%
mutate(!!new_col := (!!quo_column) + 1) %>%
group_by(Species) %>%
summarize(!!mean_col := mean(!!new_col))
}
my_function(iris, Petal.Length)
This results in a warning stating the argument "Petal.Length_adjusted" is not numeric or logical, although the output from the mutate call gives a numeric column.
How do I reference this dynamically generated column name to pass it in further dplyr functions?
To rename a column in R, you can use the rename() function from dplyr. For example, if you want to rename the column “A” to “B” again, you can run the following code: rename(dataframe, B = A) .
The summarise_all method in R is used to affect every column of the data frame. The output data frame returns all the columns of the data frame where the specified function is applied over every column. Arguments : data – The data frame to summarise the columns of.
To perform summarise on multiple columns, create a vector with the column names and use it with across() function. This example does the group by on department and state columns, summarises on salary & bonus columns, and apply the sum function on each summarised column.
Unlike the quo_column
which is a quosure
, the new_col
and mean_col
are strings, so we convert it to symbol using sym
(from rlang
) and then do the evaluation
my_function <- function(data, column) {
quo_column <- enquo(column)
new_col <- paste0(quo_column, "_adjusted")[2]
mean_col <- paste0(quo_column, "_meanAdjusted")[2]
data %>%
mutate(!!new_col := (!!quo_column) + 1) %>%
group_by(Species) %>%
summarise(!!mean_col := mean(!! rlang::sym(new_col)))
}
head(my_function(iris, Petal.Length))
# A tibble: 3 x 2
# Species Petal.Length_meanAdjusted
# <fct> <dbl>
#1 setosa 2.46
#2 versicolor 5.26
#3 virginica 6.55
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With