
How to get words that end with certain characters within each string in R

Tags: string, regex, r, gsub

I have a vector of strings that looks like:

str <- c("bills slashed for poor families today", "your calls are charged", "complaints dept awaiting refund")

I want to get all the words that end with the letter s and remove the s. I have tried:

gsub("s$","",str)

but it doesn't work, because $ anchors to the end of the whole string, so it only matches an s at the end of a string rather than at the end of each word. I'm trying to get an output that looks like:

[1] bill slashed for poor familie today
[2] your call are charged
[3] complaint dept awaiting refund

Any pointers as to how I can do this? Thanks

Tavi asked Aug 25 '14 11:08





2 Answers

$ checks for the end of the string, not the end of a word.

To match at a word boundary, use \b (written "\\b" in an R string, because the backslash itself has to be escaped).

So:

gsub("s\\b", "", str)
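To see the difference on the vector from the question, here is a small sketch. The name x is my own stand-in for str (chosen only so the base function str() isn't masked), and the "this is" case is an added illustration rather than something from the question.

```r
# Vector from the question; named x here (an assumption) to avoid masking base::str()
x <- c("bills slashed for poor families today",
       "your calls are charged",
       "complaints dept awaiting refund")

# "$" anchors at the end of the whole string. None of these strings ends
# in "s", so the original attempt leaves every element unchanged.
unchanged <- gsub("s$", "", x)
identical(unchanged, x)

# "\\b" matches at a word boundary, so an "s" that ends any word is removed.
singular <- gsub("s\\b", "", x)
print(singular)

# Caveat: the pattern knows nothing about plurals. It strips every
# word-final "s", including the ones in "this" and "is".
caveat <- gsub("s\\b", "", "this is")
print(caveat)
```

Because the pattern is purely positional, words that merely happen to end in s are shortened too, so it is worth checking the result on real text before relying on it.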
nico answered Sep 27 '22 22:09



Here's a non-base-R solution, using the rebus and stringr packages:

library(rebus)
library(stringr)

plurals <- "s" %R% BOUNDARY
str_replace_all(str, pattern = plurals, replacement = "")
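If rebus isn't available, the same idea can probably be written with stringr alone. This is a sketch that assumes "s" %R% BOUNDARY builds the regex s\b (my reading of rebus, not something the answer states); x is again a hypothetical stand-in for str.

```r
library(stringr)

# Vector from the question; named x here (an assumption) to avoid masking base::str()
x <- c("bills slashed for poor families today",
       "your calls are charged",
       "complaints dept awaiting refund")

# Plain-string version of the word-final "s" pattern
plurals <- "s\\b"

# Remove every "s" that sits directly before a word boundary
result <- str_replace_all(x, pattern = plurals, replacement = "")
print(result)
```

The rebus version reads more like prose, while the plain pattern avoids one extra dependency. Which to prefer is mostly a matter of taste.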
epo3 answered Sep 27 '22 23:09
