
How to extract numbers in between characters in R

Tags:

regex

r

I have a character vector containing strings like "p.L86*", "p.A59fs*4", "p.E309*", etc. Each has different digits. I only want to extract the first run of digits in each string, so the expected result would be 86, 59, 309.

I tried gsub("[^0-9]+","","p.A59fs*4"), but it keeps every digit in the string and returns "594".

Xu Jing asked Sep 27 '15 21:09



1 Answer

You can use sub with a capture group to keep only the first run of digits in each string:

x <- c('p.L86*', 'p.A59fs*4', 'p.E309*')
sub('\\D*(\\d+).*', '\\1', x)
# [1] "86"  "59"  "309"
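
As a rough sketch of how that pattern works and what to watch for, the snippet below annotates each piece, converts the result to integers, and shows the no-match case. The variable name first_num and the sample string 'p.Met' are made up for illustration and are not from the original answer.

```r
x <- c('p.L86*', 'p.A59fs*4', 'p.E309*')

# \\D*    skips any leading non-digits ("p.L")
# (\\d+)  captures the first run of digits
# .*      consumes the rest of the string so it is dropped
first_num <- sub('\\D*(\\d+).*', '\\1', x)

# sub() returns character strings; convert if numbers are needed
first_int <- as.integer(first_num)

# Caveat: when a string contains no digits, the pattern does not match
# and sub() returns the input unchanged
no_match <- sub('\\D*(\\d+).*', '\\1', 'p.Met')
```

If some strings might contain no digits at all, it is probably worth checking for that (for example with grepl('\\d', x)) before converting to integer.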

Or fall back to the stringi package and extract the first match directly:

library(stringi)
stri_extract_first_regex(x, '\\d+')
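
If installing a package is not an option, a similar "extract the first match" approach should be possible in base R with regexpr and regmatches. This is a minimal sketch of that alternative, not part of the original answer; first_num is a hypothetical variable name.

```r
x <- c('p.L86*', 'p.A59fs*4', 'p.E309*')

# regexpr() locates the first match of the pattern in each string
m <- regexpr('[0-9]+', x)

# regmatches() pulls out the matched text at those positions
first_num <- regmatches(x, m)
```

One difference to keep in mind: regmatches drops elements that have no match, so the result can be shorter than x when some strings contain no digits.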
hwnd answered Sep 28 '22 05:09
