How can I select columns using a Perl-style regex (as with perl = TRUE)?
data.frame(baa=0,boo=0,boa=0,lol=0,bAa=0) %>% dplyr::select(matches("(?i)b(?!a)"))
Error in grep(needle, haystack, ...) : invalid regular expression '(?i)b(?!a)', reason 'Invalid regexp'
The regex itself is valid when grep is called with perl = TRUE:
grep("(?i)b(?!a)",c("baa","boo","boa","lol","bAa"),perl=T)
> [1] 2 3
Is there a shortcut function/way?
matches in dplyr does not support perl = TRUE. However, you can write your own function. After a bit of digging in the source code, this works:
The fast way:
library(dplyr)
# Notice the three colons: grep_vars is not exported from dplyr
matches2 <- function(match, ignore.case = TRUE, vars = current_vars())
{
  dplyr:::grep_vars(match, vars, ignore.case = ignore.case, perl = TRUE)
}
data.frame(baa=0,boo=0,boa=0,lol=0,bAa=0) %>% select(matches2("(?i)b(?!a)"))
#   boo boa
# 1   0   0
Or a more explanatory solution:
matches2 <- function(match, ignore.case = TRUE, vars = current_vars())
{
  grep_vars2(match, vars, ignore.case = ignore.case)
}
# This is pretty much my only change to the original dplyr:::grep_vars,
# to make it accept perl.
grep_vars2 <- function(needle, haystack, ...)
{
  grep(needle, haystack, perl = TRUE, ...)
}
data.frame(baa=0,boo=0,boa=0,lol=0,bAa=0) %>%
  select(matches2("(?i)b(?!a)"))
#   boo boa
# 1   0   0
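To see why only boo and boa are selected, it can help to test the pattern on the column names directly. The following is a minimal base R sketch, not part of the original answer; the variable names (cols, pattern, hits, default_fails) are made up for illustration.

```r
# Column names from the example data frame
cols <- c("baa", "boo", "boa", "lol", "bAa")

# (?i) makes the match case-insensitive;
# (?!a) is a negative lookahead: the character after "b" must not be "a" (or "A")
pattern <- "(?i)b(?!a)"

# With perl = TRUE the PCRE engine understands the lookahead
hits <- grepl(pattern, cols, perl = TRUE)
print(setNames(hits, cols))

# Without perl = TRUE the default engine rejects the lookahead,
# which is the "Invalid regexp" error reported by select(matches(...))
default_fails <- tryCatch(
  {
    suppressWarnings(grepl(pattern, cols))
    FALSE
  },
  error = function(e) TRUE
)
print(default_fails)
```

Only "boo" and "boa" contain a "b" that is not followed by "a" or "A", which agrees with the grep output in the question.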
Another approach, along the same lines as LyzandeR's suggestion but probably more dangerous:
body(matches)[[grep("grep_vars", body(matches))]] <- substitute(grep_vars(match, vars, ignore.case = ignore.case, perl=T))
data.frame(baa=0,boo=0,boa=0,lol=0,bAa=0) %>% dplyr::select(matches("(?i)b(?!a)"))
#   boo boa
# 1   0   0
I would not hard-code the position with body(matches)[[3]], as any update to dplyr could shift that index and cause this little patch to create problems.
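Both approaches above depend on dplyr internals, which can change or disappear between versions. As a hedged alternative, here is a sketch that does the matching in base R and only then selects the columns; the names df, keep and selected are illustrative and not from the answers above.

```r
df <- data.frame(baa = 0, boo = 0, boa = 0, lol = 0, bAa = 0)

# Base R grep supports perl = TRUE; value = TRUE returns the matching names
keep <- grep("(?i)b(?!a)", names(df), perl = TRUE, value = TRUE)

# Subset by name; single-bracket indexing with one argument keeps a data frame
selected <- df[keep]
print(selected)

# In a dplyr pipeline the same vector of names should be usable as
# (assumes a dplyr version that exports all_of):
#   df %>% dplyr::select(dplyr::all_of(keep))
```

This keeps the regex handling entirely in grep, so nothing needs to be patched.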