Is there a function in dplyr that allows you to test the same condition against a selection of columns?
Take the following dataframe:
Demo1 <- c(8,9,10,11)
Demo2 <- c(13,14,15,16)
Condition <- c('A', 'A', 'B', 'B')
Var1 <- c(13,76,105,64)
Var2 <- c(12,101,23,23)
Var3 <- c(5,5,5,5)
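# cbind() coerces every column to character, so the Var columns are converted back to numeric below
df <- as.data.frame(cbind(Demo1, Demo2, Condition, Var1, Var2, Var3), stringsAsFactors = FALSE)
df[4:6] <- lapply(df[4:6], as.numeric)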
I want to keep all the rows in which at least one of Var1, Var2, or Var3 has a value greater than 100. I realise that I could do this with a series of "or" conditions, like so:
library(dplyr)
df <- df %>%
  filter(Var1 > 100 | Var2 > 100 | Var3 > 100)
However, since I have quite a few columns in my actual dataset this would be time-consuming. I am assuming that there is some reasonably straightforward way to do this but haven't been able to find a solution on SO.
We can do this with filter_at and any_vars:
df %>%
  filter_at(vars(matches("^Var")), any_vars(. > 100))
# Demo1 Demo2 Condition Var1 Var2 Var3
#1 9 14 A 76 101 5
#2 10 15 B 105 23 5
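In more recent dplyr releases the scoped verbs such as filter_at are marked as superseded, and the same row filter can be written with if_any inside a plain filter call. The following is a minimal sketch, assuming dplyr 1.0.4 or later is installed; the name `result` is only illustrative, and the data frame is rebuilt with data.frame() so the snippet runs on its own.

```r
library(dplyr)

# Same example data as in the question, built directly with data.frame()
df <- data.frame(
  Demo1 = c(8, 9, 10, 11),
  Demo2 = c(13, 14, 15, 16),
  Condition = c("A", "A", "B", "B"),
  Var1 = c(13, 76, 105, 64),
  Var2 = c(12, 101, 23, 23),
  Var3 = c(5, 5, 5, 5),
  stringsAsFactors = FALSE
)

# Keep rows where at least one column whose name starts with "Var" exceeds 100
# (assumes dplyr >= 1.0.4, where if_any() is available)
result <- df %>%
  filter(if_any(starts_with("Var"), ~ .x > 100))

print(result)
```

Swapping if_any for if_all would instead keep only the rows where every selected column satisfies the condition.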
Or, using base R, create a logical vector with lapply and Reduce, and use it to subset the rows:
df[Reduce(`|`, lapply(df[grepl("^Var", names(df))], `>`, 100)),]
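To make the base R one-liner easier to follow, it can be unpacked into its intermediate steps. This is only an illustrative breakdown; the names `var_cols`, `over_100`, and `keep` are not from the original answer, and the data frame is rebuilt here so the snippet is self-contained.

```r
# Same example data as in the question
df <- data.frame(
  Demo1 = c(8, 9, 10, 11),
  Demo2 = c(13, 14, 15, 16),
  Condition = c("A", "A", "B", "B"),
  Var1 = c(13, 76, 105, 64),
  Var2 = c(12, 101, 23, 23),
  Var3 = c(5, 5, 5, 5),
  stringsAsFactors = FALSE
)

# Step 1: logical vector marking the columns whose names start with "Var"
var_cols <- grepl("^Var", names(df))

# Step 2: compare each selected column with 100, giving one logical vector per column
over_100 <- lapply(df[var_cols], `>`, 100)

# Step 3: combine the per-column vectors with an element-wise OR
keep <- Reduce(`|`, over_100)

# Step 4: keep only the rows where at least one comparison was TRUE
subset_df <- df[keep, ]
print(subset_df)
```

Because Reduce folds the list pairwise with `|`, this works for any number of matching columns without spelling them out.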
In base R, one can write the same filter using rowSums:
df[rowSums(df[, grepl("^Var", names(df))] > 100) >= 1, ]
# Demo1 Demo2 Condition Var1 Var2 Var3
# 2 9 14 A 76 101 5
# 3 10 15 B 105 23 5