
Extract string between /

Tags:

regex

r

If I have these strings:

mystrings <- c("X2/D2/F4",
               "X10/D9/F4",
               "X3/D22/F4",
               "X9/D22/F9")

How can I extract 2, 9, 22, 22? These are the characters between the two / characters, after the first character of that middle field.

I would like to do this in a vectorized fashion and, if possible, add the result as a new column with transform, which I am familiar with.

I think this regex gets me somewhere near all the characters between the /:

^.*\\'(.*)'\\.*$
user1320502 asked Jan 03 '13 20:01


1 Answer

> gsub("(^.+/[A-Z]+)(\\d+)(/.+$)", "\\2", mystrings)
[1] "2"  "9"  "22" "22"

You would "read" (or "parse") that regex pattern as splitting any matched string into three parts:

1) anything up to and including the first forward slash, followed by a sequence of capital letters,

2) any digits (= "\d") in a sequence before the next slash, and

3) from the next slash to the end.

The replacement "\\2" then returns only the second part.
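To see the three parts separately, one option is to pull out the capture groups with base R's regexec() and regmatches(). This is only an illustrative sketch: the variable names pattern and parts are made up here and are not part of the original answer.

```r
mystrings <- c("X2/D2/F4", "X10/D9/F4", "X3/D22/F4", "X9/D22/F9")

# Same pattern as in the gsub() call, stored in a (hypothetical) variable
pattern <- "(^.+/[A-Z]+)(\\d+)(/.+$)"

# For each string: element 1 is the full match, elements 2-4 are the groups
parts <- regmatches(mystrings, regexec(pattern, mystrings))

parts[[1]]
# full match, then group 1, group 2, group 3

# Keep only the second capture group (element 3) from every string
middle <- vapply(parts, function(p) p[3], character(1))
middle
```

Element 3 of each result corresponds to "\\2" in the gsub() replacement.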

Non-matched character strings would be returned unaltered.
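Since the question asks for a new column via transform, here is a minimal sketch of how that might look. The data frame df and the column names id and middle are assumptions for illustration; the conversion with as.integer() is optional and only makes sense if every string matches.

```r
mystrings <- c("X2/D2/F4", "X10/D9/F4", "X3/D22/F4", "X9/D22/F9")

# Hypothetical data frame holding the strings in a column called id
df <- data.frame(id = mystrings, stringsAsFactors = FALSE)

pattern <- "(^.+/[A-Z]+)(\\d+)(/.+$)"

# gsub() is vectorized, so it can be used directly inside transform()
df <- transform(df, middle = as.integer(gsub(pattern, "\\2", id)))
df

# A string that does not match the pattern comes back unchanged
unmatched <- gsub(pattern, "\\2", "no-slashes-here")
unmatched
```

Because unmatched strings are returned as-is, as.integer() would produce NA (with a warning) for them, which can serve as a rough check that every row matched.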

IRTFM answered Oct 14 '22 13:10