
How to extract the last digits of strings using regular expressions?

I have a set of column names:

L_1_3
L_2_23
L_3_91
L_3_16

I want to replace these colnames with new names using the last digits following the _ like this:

3
23
91
16

I've tried colnames(X) <- gsub("L_\\d\\d_", "", colnames(X)) which works for strings with double digits at the end. I want one that works for both single and double digits.

Thank you!

Drew asked Jun 02 '20 18:06


2 Answers

Here's an option with positive lookahead:

gsub(".+_(?=\\d+$)", "", X, perl = TRUE)
[1] "3"  "23" "91" "16"
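
As a rough sketch of how this could be applied to the column names themselves: the greedy `.+_` consumes everything up to the last underscore, and the lookahead `(?=\\d+$)` only checks that digits follow through the end of the string without consuming them. The data frame below is a hypothetical stand-in for the asker's `X`, not something taken from the question.

```r
# Hypothetical data frame standing in for the asker's X (an assumption)
X <- data.frame(L_1_3 = 1, L_2_23 = 2, L_3_91 = 3, L_3_16 = 4)

# Remove everything up to and including the last "_" that is
# followed only by digits until the end of the string.
# perl = TRUE is needed because lookaheads are a PCRE feature.
new_names <- gsub(".+_(?=\\d+$)", "", colnames(X), perl = TRUE)

colnames(X) <- new_names
print(colnames(X))
```

Because the pattern does not mention `L_` or the middle number, it should also cope with prefixes of other shapes, as long as the name ends in an underscore followed by digits.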
Ian Campbell answered Sep 18 '22 14:09


If that pattern works for you with two digits, all you need to do is make the second digit optional using ?

L_\\d\\d?_
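
A minimal sketch of that pattern in use, assuming the names are held in a character vector. The vector `x` and the extra name `"L_12_7"` are illustrative assumptions, added only to show a two-digit middle number.

```r
# Illustrative names; "L_12_7" is a made-up case with a two-digit middle number
x <- c("L_1_3", "L_2_23", "L_3_91", "L_3_16", "L_12_7")

# "\\d\\d?" matches one digit, optionally followed by a second one,
# so the prefix is removed for both "L_1_" and "L_12_".
res <- gsub("L_\\d\\d?_", "", x)
print(res)
```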



If you must match the whole string, you could use a capturing group, anchor the pattern with ^ (start of string) and $ (end of string), and use the group in the replacement.

^L_\\d\\d?_(\\d+)$

In parts

^      Start of string
L_     Match L_
\d     Match a digit
\d?    Match a digit and repeat 0 or 1 times
_      Match _
(      Capture group 1
  \d+  Match a digit and repeat 1 or more times
)      Close group
$      End of string


X <- c("L_1_3", "L_2_23", "L_3_91", "L_3_16")
gsub("^L_\\d\\d?_(\\d+)$", "\\1", X)

Output

[1] "3"  "23" "91" "16"
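
One consequence of anchoring the whole pattern, sketched below under the assumption that some names might not follow the `L_<1-2 digits>_<digits>` shape: strings that do not match are returned unchanged rather than partially rewritten. The inputs `"L_123_4"` and `"other"` are hypothetical.

```r
# Hypothetical mix of matching and non-matching names
y <- c("L_1_3", "L_123_4", "other")

# Only a full match from ^ to $ is replaced by capture group 1;
# anything else is left as it is.
out <- gsub("^L_\\d\\d?_(\\d+)$", "\\1", y)
print(out)
```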
The fourth bird answered Sep 18 '22 14:09