
R sub: extract everything before the last occurrence of a character

Tags:

r

I have the following vector c:

ABC-XXX
DEF-4-YYY

I want to extract everything before the last occurrence of '-', meaning that I would keep this:

ABC
DEF-4

I've tried the following:

sub('[-].*', '', "DEF-4-YYY")

But this removes everything from the first '-' onward, while it should look for the last '-'. The output of the above command is:

"DEF"

What is wrong here?

user1987607 asked Jan 22 '18 14:01



1 Answer

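We can do this with sub by matching a - followed by zero or more characters that are not a -, up to the end ($) of the string, and replacing the match with an empty string (''). Because the match has to reach the end of the string without crossing another -, only the last - can start it.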

sub('-[^-]*$', '', v1)
#[1] "ABC"   "DEF-4"

data

v1 <- c('ABC-XXX', 'DEF-4-YYY')
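
To see why the attempt in the question cuts too early, here is a small side-by-side sketch. It reuses the vector from the question; the variable names (first_cut, last_cut, greedy_cut) are made up for illustration, and the capture-group variant is an alternative that is not part of the answer above.

```r
# Input copied from the question
v1 <- c("ABC-XXX", "DEF-4-YYY")

# Attempt from the question: the leftmost '-' matches first,
# then '.*' consumes everything after it
first_cut <- sub("-.*", "", v1)

# Pattern from the answer: '-' followed only by non-'-' characters
# up to the end of the string, so only the last '-' qualifies
last_cut <- sub("-[^-]*$", "", v1)

# Alternative (an assumption, not from the answer): the greedy '(.*)'
# grabs as much as it can and backtracks to the last '-';
# '\\1' keeps only the captured part
greedy_cut <- sub("(.*)-.*", "\\1", v1)

print(first_cut)
print(last_cut)
print(greedy_cut)
```

In all three calls, strings that contain no '-' at all are returned unchanged, because sub leaves its input alone when the pattern does not match.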
akrun answered Sep 18 '22 11:09