
How to remove " 's" in a string?

Tags: string, regex, r

I am going through strings of data looking for Instagram usernames, and I have been able to use regex to remove almost all unnecessary characters. I can't figure out how to remove the "’s" trailing the words.

I am able to remove every other special character with regex, but I can either remove the apostrophe and not the s, or skip over it entirely.

> follower.list
[1] "@kyrieirving’s" "@jaytatum0"
> follower.list <- gsub("[^[:alnum:][:blank:]@_]", "", follower.list)
> follower.list
[1] "@kyrieirvings" "@jaytatum0"

Expected:

[1] "@kyrieirving" "@jaytatum0"
Vincent Cortese asked Jul 13 '19 17:07


1 Answer

Use

['’]s\b|[^[:alnum:][:blank:]@_]


Details

  • ['’]s\b - a straight (') or curly (’) apostrophe, then s at the end of a word
  • | - or
  • [^[:alnum:][:blank:]@_] - any character other than an alphanumeric character, horizontal whitespace, @ or _

R demo:

> x <- c("@kyrieirving’s", "@jaytatum0")
> gsub("['’]s\\b|[^[:alnum:][:blank:]@_]", "", x)
[1] "@kyrieirving" "@jaytatum0" 
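
If the single alternation is hard to read, the same idea can probably be written as two passes. This is only a sketch: `clean_handles` is a hypothetical helper name, not something from the question, and `\u2019` is used for the curly apostrophe so the pattern does not depend on the encoding of the script file. The possessive has to be removed first, because once the apostrophe has been stripped there is nothing left to distinguish a possessive s from an s that belongs to the username.

```r
# Hypothetical helper (the name is an assumption, not from the question).
clean_handles <- function(x) {
  # Pass 1: drop a possessive 's at the end of a word.
  # The class matches a straight apostrophe or a curly one (U+2019).
  x <- gsub("['\u2019]s\\b", "", x)
  # Pass 2: drop everything that is not alphanumeric, blank, @ or _.
  gsub("[^[:alnum:][:blank:]@_]", "", x)
}

handles <- c("@kyrieirving\u2019s", "@jaytatum0", "@kyrieirvings")
clean_handles(handles)
# [1] "@kyrieirving"  "@jaytatum0"    "@kyrieirvings"
```

Note that a username that really ends in s (the third element) is left alone, since only an s preceded by an apostrophe is removed.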
Wiktor Stribiżew answered Sep 22 '22 01:09