I have a sentence and a regex, as shown in the image. It works as I want, but I am really stuck porting it to R.

My R query is:
gsub("\\S*[^[:alnum:]\\s\\?\\!\",();:\\.'\\/-]+\\S*", "", x)
but it cuts everything. I can't find my error.
Even the shorter version with only alnum, "\\S*[^[:alnum:]]+\\S*", cuts everything.
I don't understand. Please help.
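You cannot use the \s shorthand class inside a TRE bracket expression; replace it with [:space:]. Also unescape all the other "special" characters: inside a bracket expression they already match literal symbols, so they should not be escaped. Because whitespace is not excluded from your negated class, the class also matches the spaces between words, so each match runs from one word across the space into the next and everything gets removed. The same happens with the shorter [^[:alnum:]] version.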
pat <- "\\S*[^[:alnum:][:space:]?!\",();:.'/-]+\\S*"
x <- "But what's about in a interacting QFT a 2-particla state in the far past: $|E_{\\bf p_1}, {\\bf p_1}, E_{\\bf p_2} {\\bf p_2}>$ which undergoes"
gsub(pat, "", x)
Note that gsub(pat, "", x, perl=TRUE) will also work, since PCRE accepts POSIX character classes inside bracket expressions too.
See the R demo
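As a small sketch of the points above, here is a comparison on a made-up string (the sample text and the variable names are only for illustration, not from the question). It shows the short pattern eating everything because a space is itself a non-alphanumeric character, and the fixed pattern giving the same result with the default TRE engine and with perl=TRUE, where \s is allowed inside brackets.

```r
# Illustrative input: only "$drop_me}" contains characters outside the allowed set
x <- "keep this $drop_me} and (this), too"

# Whitespace is not excluded, so the negated class also matches spaces
# and every match spans two neighbouring words
broken <- gsub("\\S*[^[:alnum:]]+\\S*", "", x)

# Default engine (TRE): use [:space:] inside the bracket expression
tre <- gsub("\\S*[^[:alnum:][:space:]?!\",();:.'/-]+\\S*", "", x)

# PCRE engine: \s is understood inside a bracket expression
pcre <- gsub("\\S*[^[:alnum:]\\s?!\",();:.'/-]+\\S*", "", x, perl = TRUE)

# Optional cleanup: collapse the runs of spaces left behind by removed tokens
tidy <- trimws(gsub("\\s+", " ", tre))

print(broken)
print(tre)
print(identical(tre, pcre))
print(tidy)
```

The removed tokens leave their surrounding spaces in place, which is why the last step collapses repeated whitespace; skip it if the original spacing matters.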