I have a data set like this:
PatientNum<- c(1, 2, 3, 4, 5)
Age<- c(10, 21, 51, 42, 35)
SurgicalProcedureNumber <- c(21356, 21424, 221356, 12345, 54321)
OtherSurgicalProcedureNumber <- c(54321, 21356, 12345, 12345, 21424)
BloodLoss<-c(5,4,5,10,5)
YetAnotherSurgicalProcedureNumber<-c(11111,22222,33333,21356,555555)
dataset <- data.frame(PatientNum, Age, SurgicalProcedureNumber, OtherSurgicalProcedureNumber, BloodLoss,YetAnotherSurgicalProcedureNumber)
I want to know when the patients underwent certain surgeries:
NumbersIcareAbout<- c(21356,21424)
I know to find them in one column I could do:
dataset%>%filter(SurgicalProcedureNumber%in%NumbersIcareAbout)
And that would return patient 1 and 2 because it has those codes... but I want it to return the row anytime those numbers shows up in SurgicalProcedureNumber,OtherSurgicalProcedureNumber, or YetAnotherSurgicalProcedureNumber
No idea how and couldn't find it on here.
In dplyr
version 1.0.4, we can use filter
with if_any
library(dplyr)
dataset %>%
filter(if_any(ends_with("ProcedureNumber"), ~ . %in% NumbersIcareAbout))
You can use the OR operator |
like this:
dataset %>%
filter(SurgicalProcedureNumber %in% NumbersIcareAbout |
OtherSurgicalProcedureNumber %in% NumbersIcareAbout |
YetAnotherSurgicalProcedureNumber %in% NumbersIcareAbout)
# PatientNum Age SurgicalProcedureNumber OtherSurgicalProcedureNumber BloodLoss YetAnotherSurgicalProcedureNumber
# 1 1 10 21356 54321 5 11111
# 2 2 21 21424 21356 4 22222
# 3 4 42 12345 12345 10 21356
# 4 5 35 54321 21424 5 555555
I'm sure there's a more elegant way, but this filters to those with matches in 2 or more columns (I think):
dataset %>%
filter((SurgicalProcedureNumber %in% NumbersIcareAbout +
OtherSurgicalProcedureNumber %in% NumbersIcareAbout +
YetAnotherSurgicalProcedureNumber %in% NumbersIcareAbout) >= 2)
# PatientNum Age SurgicalProcedureNumber OtherSurgicalProcedureNumber BloodLoss YetAnotherSurgicalProcedureNumber
# 1 2 21 21424 21356 4 22222
It works by summing together the logical vectors that result from your comparisons using %in%
. It relies on the fact that TRUE + TRUE
gives 2
. So, if a row has a value of 2 or greater then it has multiple matches.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With