
Replace all strings which do not include at least one letter

Tags: regex, r

I want to replace strings that contain only special characters (and spaces) with an empty string. If a string contains at least one letter or digit, it should be left as is.

test_cases <- c("a", "&", "&& ", "& &", "& ", "&a", "& a", "1", "& 1", "&1", "& a d", "a ")
exp_out <- c("a", "", "", "", "", "&a", "& a", "1", "& 1", "&1", "& a d", "a ")

I used a negative lookahead for that:

gsub("^[^a-zA-Z0-9]+(?! *[a-zA-Z0-9]+ *)", "", test_cases, perl = TRUE)
# [1] "a"     ""      ""      ""      ""      "&a"    "& a"   "1"     "& 1"   "&1"    "& a d" "a "
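
Since edge cases are easy to miss, one option is to compare the result against exp_out programmatically rather than by eye. This is a minimal sketch that only uses the vectors defined above; the name res is introduced here for illustration:

```r
test_cases <- c("a", "&", "&& ", "& &", "& ", "&a", "& a", "1", "& 1", "&1", "& a d", "a ")
exp_out <- c("a", "", "", "", "", "&a", "& a", "1", "& 1", "&1", "& a d", "a ")

# Apply the lookahead-based regex from above
res <- gsub("^[^a-zA-Z0-9]+(?! *[a-zA-Z0-9]+ *)", "", test_cases, perl = TRUE)

# stopifnot() raises an error if the result differs from the expected output
stopifnot(identical(res, exp_out))
```

Rerunning this check after each change to the pattern would flag a regression on any of the listed cases.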

This regex seems rather verbose, and I had to adapt it several times during testing because I had missed some edge cases. Is there a "simpler", that is, shorter, regex that does the same job?

thothal asked Mar 24 '26 11:03


1 Answer

You may use

test_cases[!grepl("[[:alpha:][:digit:]]", test_cases)] <- ""

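grepl("[[:alpha:][:digit:]]", test_cases) returns TRUE for every element that contains at least one letter ([:alpha:]) or digit ([:digit:]). Negating it with ! selects the elements that contain neither, and the assignment replaces exactly those elements with an empty string.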

Output

 [1] "a"     ""      ""      ""      ""      "&a"    "& a"   "1"     "& 1"  
[10] "&1"    "& a d" "a "
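
To make the steps visible, here is a sketch that stores the logical mask before using it and builds the result without modifying test_cases in place. The names has_alnum, result and alt are introduced here for illustration. The last line is an alternative that is not part of the approach above: a single anchored sub() call, assuming [:alnum:] covers the same characters as [:alpha:] plus [:digit:] in your locale.

```r
test_cases <- c("a", "&", "&& ", "& &", "& ", "&a", "& a", "1", "& 1", "&1", "& a d", "a ")
exp_out <- c("a", "", "", "", "", "&a", "& a", "1", "& 1", "&1", "& a d", "a ")

# TRUE where the string contains at least one letter or digit
has_alnum <- grepl("[[:alpha:][:digit:]]", test_cases)

# Keep strings with a letter or digit, blank out the rest
result <- ifelse(has_alnum, test_cases, "")

# Alternative sketch: blank a string only if it consists entirely of
# non-alphanumeric characters from start (^) to end ($)
alt <- sub("^[^[:alnum:]]+$", "", test_cases)

stopifnot(identical(result, exp_out), identical(alt, exp_out))
```

Because the alternative pattern is anchored at both ends, it needs neither a lookahead nor perl = TRUE.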
Wiktor Stribiżew answered Mar 25 '26 23:03


