I have two data frames (df1 and df2) in R with different information except for two columns, a person number (pnr) and a drug name (name). Row for row in df1, I want to check, if the combination of pnr and name exists somewhere in df2. If this combination exists, I want "yes" in a another column in df1. If not a "no".
df1
pnr|drug|...|check
---|----|---|-----
1 | 1 |...| no
1 | 2 |...| yes
2 | 2 |...| yes
3 | 2 |...| no
.....
df2
pnr|drug|...|
---|----|---|
1 | 2 |...|
2 | 2 |...|
....
For example, I want check, if the row combination pnr=1 & drug=1 exists in df2 (no), pnr=1 & drug=2 (yes) etc. And then place a "yes" or "no" in the check column in df1
I have tried the following for
statement without luck. It does place a "yes or "no" in the "check" column, but it doesn't do it correctly
for(index in 1:nrow(df1)){
if((df1[index,]$pnr %in% df2$pnr)&(df1[index,]$name %in% df2$name)){
check_text="yes"}else{check_text="no"}
df1$check=check_text
}
I have a sense that I should be using apply
, but I haven't been able to figure that out. Do any of you have an idea how to solve this?
One way is using base R
methods.
Pasting the columns pnr
and drug
together and finding a similar match
in df1
df1$check <- ifelse(is.na(match(paste0(df1$pnr, df1$drug),
paste0(df2$pnr, df2$drug))),"No", "Yes")
# pnr drug check
#1 1 1 No
#2 1 2 Yes
#3 2 2 Yes
#4 3 2 No
This is natural for dplyr::left_join
:
library(dplyr) # for left_join, transmute
library(tidyr) # for replace_na
df1 <- expand.grid(pnr = 1:3, drug = 1:3)
df2 <- data.frame(pnr = c(1, 3), drug = c(2, 1))
df1 <- df1 %>%
left_join(df2 %>% transmute(pnr, drug, check = 'yes')) %>%
replace_na(list(check = 'no'))
df1
#> pnr drug check
#> 1 1 1 no
#> 2 2 1 no
#> 3 3 1 yes
#> 4 1 2 yes
#> 5 2 2 no
#> 6 3 2 no
#> 7 1 3 no
#> 8 2 3 no
#> 9 3 3 no
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With