I want to replace strings that contain only special characters with an empty string. If a string contains at least one letter or digit, it should be left as is.
test_cases <- c("a", "&", "&& ", "& &", "& ", "&a", "& a", "1", "& 1", "&1", "& a d", "a ")
exp_out <- c("a", "", "", "", "", "&a", "& a", "1", "& 1", "&1", "& a d", "a ")
I used a negative lookahead for that:
gsub("^[^a-zA-Z0-9]+(?! *[a-zA-Z0-9]+ *)", "", test_cases, perl = TRUE)
# [1] "a" "" "" "" "" "&a" "& a" "1" "& 1" "&1" "& a d" "a "
This regex seems rather verbose, and while testing I had to adapt it several times because I had forgotten some edge cases. So I was wondering whether there is a "simpler" regex, that is, a shorter one?
You may use
test_cases[!grepl("[[:alpha:][:digit:]]", test_cases)] <- ""
See the R demo
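The !grepl("[[:alpha:][:digit:]]", test_cases) call returns a logical vector that is TRUE only for the items that contain no letter ([:alpha:]) and no digit ([:digit:]). Those items are then overwritten with an empty string, and all other items are left untouched.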
Output
[1] "a" "" "" "" "" "&a" "& a" "1" "& 1"
[10] "&1" "& a d" "a "
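To make the indexing step easier to follow, here is a minimal sketch that splits the one-liner into its parts and checks the result against exp_out from the question. The names mask, result and regex_only are only illustrative, and [[:alnum:]] is used as shorthand for [[:alpha:][:digit:]]. The last part shows a regex-only variant, assuming you would rather keep a single sub() call: anchoring with ^ and $ means the pattern matches only when the whole string consists of non-alphanumeric characters, so no lookahead is needed.

```r
test_cases <- c("a", "&", "&& ", "& &", "& ", "&a", "& a", "1", "& 1", "&1", "& a d", "a ")
exp_out    <- c("a", "", "", "", "", "&a", "& a", "1", "& 1", "&1", "& a d", "a ")

# TRUE for every element that contains no letter and no digit
mask <- !grepl("[[:alnum:]]", test_cases)
which(mask)
# [1] 2 3 4 5

# Overwrite only the flagged elements; work on a copy to keep the input intact
result <- test_cases
result[mask] <- ""
identical(result, exp_out)
# [1] TRUE

# Regex-only variant: the whole string must consist of non-alphanumerics
regex_only <- sub("^[^[:alnum:]]+$", "", test_cases)
identical(regex_only, exp_out)
# [1] TRUE
```

Note that the POSIX classes are locale-dependent and may also treat non-ASCII letters such as "é" as alphabetic, whereas [a-zA-Z0-9] from the question covers ASCII only. Which behaviour you want depends on your data.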