I'm trying to find a way to filter rows using both the partial name of the column headers and a mathematical expression (x > 0). Given my data here:
OTU_ID X3_22L15_S X3_22T10_W X3_22L6_S X3_22Algae
1 denovo109 16 0 9 0
2 denovo147 44 484 28 0
3 denovo297 0 0 7 14
4 denovo1013 0 1 0 0
I want to include only the rows that have a value > 0 in columns with headers that end in "S" AND columns that end in "W". I've found ways to filter the columns based on a substring of column headers and ways to filter based on values in a certain column, but have not found anything that allows me to filter based on values > 0 that occur simultaneously in two sets of columns specified by different substrings.
Any help would be appreciated!
We can use filter_at
from the dplyr
package. dt2
is the final output.
# Load package
library(dplyr)
# Create example data frame
dt <- read.table(text = " OTU_ID X3_22L15_S X3_22T10_W X3_22L6_S X3_22Algae
1 denovo109 16 0 9 0
2 denovo147 44 484 28 0
3 denovo297 0 0 7 14
4 denovo1013 0 1 0 0",
header = TRUE, stringsAsFactors = FALSE)
# Filter the data, with any column ends with "S" or "W" and values > 0
dt2 <- dt %>%
filter_at(vars(ends_with("S")), any_vars(. > 0)) %>%
filter_at(vars(ends_with("W")), any_vars(. > 0))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With