Suppose I have some character vector, which I'd like to subset to elements that don't match some regular expression. I might use the - operator to remove the subset that grep matches:
> vec <- letters[1:5]
> vec
[1] "a" "b" "c" "d" "e"
> vec[-grep("d", vec)]
[1] "a" "b" "c" "e"
I'm given back everything except the entries that matched "d". But if I search for a regular expression that isn't found, instead of getting everything back as I would expect, I get nothing back:
> vec[-grep("z", vec)]
character(0)
Why does this happen?
It's because grep returns an integer vector, and when there's no match, it returns integer(0).
> grep("d", vec)
[1] 4
> grep("z", vec)
integer(0)
Since the - operator works elementwise and integer(0) has no elements, negating it leaves the vector unchanged:
> -integer(0)
integer(0)
so vec[-grep("z", vec)] evaluates to vec[-integer(0)], which in turn evaluates to vec[integer(0)], which is character(0).
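The chain of evaluations can be checked directly. This is a small sketch; idx is a name introduced here for illustration only:

```r
vec <- letters[1:5]

# idx is a hypothetical helper name; it holds the (empty) match positions.
idx <- grep("z", vec)

# Negating a zero-length integer vector yields the same zero-length vector.
neg_unchanged <- identical(-idx, idx)

# Both subsets use a zero-length index, so both select no elements.
same_subset <- identical(vec[-idx], vec[idx])

print(length(idx))
print(neg_unchanged)
print(same_subset)
```

With a zero-length index there is no sign left to tell "drop these" apart from "keep these", so R simply selects nothing.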
You will get the behavior you expect with invert = TRUE:
> vec[grep("d", vec, invert = TRUE)]
[1] "a" "b" "c" "e"
> vec[grep("z", vec, invert = TRUE)]
[1] "a" "b" "c" "d" "e"
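Another option, not mentioned above but commonly used, is to subset with a logical vector from grepl instead of integer positions. A minimal sketch, where keep_d and keep_z are names made up for this example:

```r
vec <- letters[1:5]

# grepl() returns a logical vector the same length as vec,
# so negating it with ! is safe even when nothing matches.
keep_d <- vec[!grepl("d", vec)]
keep_z <- vec[!grepl("z", vec)]

print(keep_d)
print(keep_z)
```

Because the logical index always has one entry per element of vec, an empty match gives all FALSE, and its negation keeps every element rather than none.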