I want to remove rows from my data frame that match a specific pattern, using tidyverse syntax if possible.
Specifically, I want to remove rows where col1 is "cat" and at least one of col2 to col4 contains any of the following words: dog, fox or cow. For this example, that removes rows 1 and 4 from the original data.
Here's a sample dataset:
df <- data.frame(col1 = c("cat", "fox", "dog", "cat", "pig"),
                 col2 = c("lion", "tiger", "elephant", "dog", "cow"),
                 col3 = c("bird", "cow", "sheep", "fox", "dog"),
                 col4 = c("dog", "cat", "cat", "cow", "fox"))
I've tried a number of across() variants but keep running into issues. Here is my latest attempt:
filtered_df <- df %>%
filter(!(animal1 == "cat" & !any(cowfoxdog <- across(animal2:animal4, ~ . %in% c("cow", "fox", "dog")))))
This returns the following error:
Error in `filter()`:
! Problem while computing `..1 = !...`.
Caused by error in `FUN()`:
! only defined on a data frame with all numeric variables
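You can use if_any(). It applies the predicate to each selected column and returns one logical value per row, which is TRUE when the predicate holds in at least one of those columns, so it combines directly with the col1 == "cat" test inside filter(). For a more robust test, I first added a row where col1 == "cat" but "dog", "fox", and "cow" don't appear in columns 2-4; that row should be kept.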
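library(dplyr)

# Extra test row: col1 is "cat" but none of the target words appear in col2:col4
df <- df %>%
  add_row(col1 = "cat", col2 = "sheep", col3 = "lion", col4 = "tiger")

# Drop rows where col1 is "cat" AND any of col2:col4 is dog, fox or cow
df %>%
  filter(!(col1 == "cat" & if_any(col2:col4, \(x) x %in% c("dog", "fox", "cow"))))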
  col1     col2  col3  col4
1  fox    tiger   cow   cat
2  dog elephant sheep   cat
3  pig      cow   dog   fox
4  cat    sheep  lion tiger
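If you want to double-check the result, one option is to compare it against a base R version of the same condition. The following is only a sketch: the names targets, hits, base_result and dplyr_result are made up for illustration, and the six-row data frame is the sample data plus the extra test row from above.

```r
library(dplyr)

# Sample data from the question plus the extra "cat" test row
df <- data.frame(
  col1 = c("cat", "fox", "dog", "cat", "pig", "cat"),
  col2 = c("lion", "tiger", "elephant", "dog", "cow", "sheep"),
  col3 = c("bird", "cow", "sheep", "fox", "dog", "lion"),
  col4 = c("dog", "cat", "cat", "cow", "fox", "tiger")
)

# Hypothetical helper vector holding the words to look for
targets <- c("dog", "fox", "cow")

# dplyr version: if_any() gives one TRUE/FALSE per row
dplyr_result <- df %>%
  filter(!(col1 == "cat" & if_any(col2:col4, \(x) x %in% targets)))

# Base R version: build a logical matrix (rows x 3 columns),
# then flag rows that have at least one match
hits <- rowSums(sapply(df[c("col2", "col3", "col4")],
                       function(x) x %in% targets)) > 0
base_result <- df[!(df$col1 == "cat" & hits), ]

# Compare the cell values only, ignoring row names
identical(unname(as.matrix(dplyr_result)), unname(as.matrix(base_result)))
```

By De Morgan's law the same filter can also be written as filter(col1 != "cat" | !if_any(col2:col4, \(x) x %in% targets)), which some people find easier to read as "keep rows that are not cat, or that have no match".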