Why doesn't dplyr's filter in the code below return the same data.frame as base R subsetting?
In fact, neither of them works as expected. I'd like to remove the rows where, simultaneously, b==1 AND c==1. That is, I'd like to remove only the third row.
library(dplyr)
df <- data.frame(a = c(0, 0, 0, 0, 1, 1, 1),
                 b = c(0, 0, 1, 1, 0, 0, 1),
                 c = c(1, NA, 1, NA, 1, NA, NA))
# dplyr version
filter(df, !(b == 1 & c == 1))
# base R version
df[!(df$b == 1 & df$c == 1), ]
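Or use complete.cases to turn the NA results into FALSE in the logical vector. complete.cases returns FALSE for every row that has an NA in b or c, so combining it with the condition via & makes those rows FALSE, and they are kept after the negation. This relies on the fact that NA & FALSE is FALSE: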
filter(df, !(b == 1 & c == 1 & complete.cases(df[c('b', 'c')])))
#   a b  c
# 1 0 0  1
# 2 0 0 NA
# 3 0 1 NA
# 4 1 0  1
# 5 1 0 NA
# 6 1 1 NA
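To see where the two original attempts go wrong, it may help to look at the intermediate logical vector. The following is a minimal base R sketch using the df from the question; the names cond, all_rows, true_rows and fixed are only illustrative. It assumes the documented behaviour that filter keeps only rows where the condition is TRUE, which is mimicked here with which().

```r
df <- data.frame(a = c(0, 0, 0, 0, 1, 1, 1),
                 b = c(0, 0, 1, 1, 0, 0, 1),
                 c = c(1, NA, 1, NA, 1, NA, NA))

# The raw condition: NA wherever b == 1 and c is missing (rows 4 and 7)
cond <- df$b == 1 & df$c == 1
cond
# [1] FALSE FALSE  TRUE    NA FALSE FALSE    NA

# Base R `[` turns each NA index into a row full of NA: 6 rows, 2 of them all NA
all_rows <- df[!cond, ]

# Keeping only the TRUE positions drops the NA rows, as filter does: 4 rows
true_rows <- df[which(!cond), ]

# AND-ing with complete.cases turns the NA entries into FALSE before negating
fixed <- df[!(cond & complete.cases(df[c("b", "c")])), ]
rownames(fixed)
# [1] "1" "2" "4" "5" "6" "7"
```

Only row 3 is removed in fixed, which is the result asked for in the question.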
Here are more logical operations involving NA. They can be a little confusing at first glance, but they follow a consistent logic: the result is NA only when the unknown value could change the outcome.
NA & FALSE
# [1] FALSE
NA | TRUE
# [1] TRUE
NA & TRUE
# [1] NA
NA | FALSE
# [1] NA
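The same rule, NA & FALSE is FALSE, also suggests a variant that guards the comparison with is.na instead of complete.cases. This is only a sketch of an alternative, not the approach used in the answer above; it rebuilds the df from the question, and the names keep and result are illustrative.

```r
df <- data.frame(a = c(0, 0, 0, 0, 1, 1, 1),
                 b = c(0, 0, 1, 1, 0, 0, 1),
                 c = c(1, NA, 1, NA, 1, NA, NA))

# !is.na(df$c) is FALSE where c is missing, and FALSE & NA is FALSE,
# so the condition inside !( ) never evaluates to NA
keep <- !(df$b == 1 & !is.na(df$c) & df$c == 1)
keep
# [1]  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE

# Only the third row is dropped
result <- df[keep, ]
```

Because keep contains no NA, it gives the same rows whether it is used with base R subsetting or passed to filter.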