Is there a robust way to pass a variable that holds a vector of dataframe column names (as strings) to the various dplyr operations?
I have just been getting into dplyr.
When I try to use operations on a subset of columns in a dataframe, dplyr does great when I name the columns explicitly and one-by-one in comma-separated lists.
This code works as expected:
library(dplyr)
# Create dataframe
df <- data.frame(
  a = c(1, 1, 1, 2, 2, 2)
  , b = c(1, 2, 3, 1, 2, 3)
  , c = c(1, 2, 1, 2, 1, 2)
)
# Identify combinations of a and c that occur more than once
df %>%
  select(a, c) %>%
  count(a, c) %>%
  filter(n > 1)
However, there are times when I already have a list of column names that I would like to pass into the dplyr steps instead of naming each column explicitly. I have not found an easy, convenient way to do this that is robust enough to work with several dplyr operations.
This code does not work:
# Attempting to do the same with a character vector of relevant column names
relevantCols <- c("a", "c")
# Fails
df %>%
  select(relevantCols)
# Trying to make a new variable based on my relevantCols variable
colsForDplyr <- sapply(relevantCols, eval)
df %>%
  # First step succeeds
  select(colsForDplyr) %>%
  # Fails at count step
  count(colsForDplyr)
In the simple example above, it is no big deal to re-type 'a, c' in every dplyr operation. However, if I have a longer list of columns, I would rather pass a variable into the dplyr operations than re-type the column names over and over.
Any tips on how to achieve this?
I will accept a solution that shows how to create, from a list of column names, a variable that can be used in various dplyr operations in place of retyping each column name.
We can convert the strings to symbols with syms() and splice them into the call with !!! to pass column names as a variable:
library(dplyr)
library(rlang)
relevantCols <- c("a", "c")
df %>%
  count(!!!syms(relevantCols)) %>%
  filter(n > 1)
#  a c n
#1 1 1 2
#2 2 2 2
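Since the question asks for something that works across several verbs, here is a minimal sketch of the same idea reused in select(), group_by() and arrange(). It assumes the example data from the question; relevantSyms and dupes are names made up for this illustration.

```r
library(dplyr)
library(rlang)

# Example data from the question
df <- data.frame(
  a = c(1, 1, 1, 2, 2, 2),
  b = c(1, 2, 3, 1, 2, 3),
  c = c(1, 2, 1, 2, 1, 2)
)
relevantCols <- c("a", "c")

# Convert the strings to symbols once (relevantSyms is a hypothetical name)
relevantSyms <- syms(relevantCols)

dupes <- df %>%
  select(!!!relevantSyms) %>%     # same as select(a, c)
  group_by(!!!relevantSyms) %>%   # same as group_by(a, c)
  tally() %>%                     # adds the count column n
  ungroup() %>%
  filter(n > 1) %>%
  arrange(!!!relevantSyms)        # same as arrange(a, c)
```

Storing the symbols once avoids calling syms() in every step; each verb still needs the !!! in front so the list is spliced in as separate arguments.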
We can also use across() with all_of(), both available from dplyr 1.0.0 onward, without loading any other packages:
library(dplyr)
df %>%
  count(across(all_of(relevantCols))) %>%
  filter(n > 1)
#  a c n
#1 1 1 2
#2 2 2 2
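To carry the same character vector through several verbs, a rough sketch (assuming dplyr 1.0.0 or later and the example data from the question; result is a name made up here) might look like the following. Selection verbs such as select() accept all_of() directly, while data-masking verbs such as group_by() and count() need it wrapped in across().

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  a = c(1, 1, 1, 2, 2, 2),
  b = c(1, 2, 3, 1, 2, 3),
  c = c(1, 2, 1, 2, 1, 2)
)
relevantCols <- c("a", "c")

result <- df %>%
  # select() uses tidyselect, so all_of() can be passed as is
  select(all_of(relevantCols)) %>%
  # group_by() is data-masking, so all_of() goes inside across()
  group_by(across(all_of(relevantCols))) %>%
  summarise(n = n(), .groups = "drop") %>%
  filter(n > 1)
```

all_of() raises an error if any name in relevantCols is missing from the dataframe; any_of() could be swapped in if missing columns should be ignored silently. In dplyr 1.1.0 and later, pick(all_of(relevantCols)) should be usable in place of across() in the grouping step.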