I am going through strings of data looking for Instagram usernames. I have been able to use regex to remove almost all of the unnecessary characters, but I can't figure out how to remove the "’s" trailing the words.
I can remove every other special character with regex. With the possessive, though, I either remove the apostrophe and leave the s behind, or skip over it entirely.
[1] "@kyrieirving’s" "@jaytatum0"
> follower.list <- gsub("[^[:alnum:][:blank:]@_]", "", follower.list)
[1] "@kyrieirvings" "@jaytatum0"
Desired output:
[1] "@kyrieirving" "@jaytatum0"
Use
['’]s\b|[^[:alnum:][:blank:]@_]
See the regex demo.
Details
- ['’]s\b - ' or ’ and then s at the end of a word
- | - or
- [^[:alnum:][:blank:]@_] - any char but an alphanumeric, horizontal whitespace, @ or _ char

R demo:
> x <- c("@kyrieirving’s", "@jaytatum0")
> gsub("['’]s\\b|[^[:alnum:][:blank:]@_]", "",x)
[1] "@kyrieirving" "@jaytatum0"
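As a rough sketch of how the same pattern might be reused, it can be wrapped in a small helper and tried on a couple of extra inputs. The function name clean_usernames and the last two test strings are made up for illustration and are not part of the original question. The curly apostrophe is written as the escape "\u2019" on the assumption that this keeps the script independent of the file's encoding.

```r
# Hypothetical helper (not from the original answer) wrapping the same pattern.
# "\u2019" is the right single quotation mark (the curly apostrophe).
clean_usernames <- function(x) {
  # First alternative: a straight or curly apostrophe plus "s" at a word end.
  # Second alternative: any char that is not alphanumeric, blank, "@" or "_".
  gsub("['\u2019]s\\b|[^[:alnum:][:blank:]@_]", "", x)
}

# The first two inputs come from the question; the last two are invented.
x <- c("@kyrieirving\u2019s", "@jaytatum0", "@chris's", "@user_name!")
cleaned <- clean_usernames(x)
print(cleaned)
# Values: "@kyrieirving" "@jaytatum0" "@chris" "@user_name"
```

Note that the first alternative strips 's from the end of any word, not only from usernames, so ordinary text such as "it's" would become "it". That seems acceptable when the goal is only to extract handles.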