
Extract numeric part of strings of mixed numbers and characters in R

Tags: string, r

I have a lot of strings, each of which tends to have the following format: Ab_Cd-001234.txt. I want to replace each one with its numeric part, 001234. How can I achieve this in R?

user288609 asked Mar 16 '13 15:03





1 Answer

The stringr package has lots of handy shortcuts for this kind of work:

    # input data following @agstudy
    data <- c('Ab_Cd-001234.txt', 'Ab_Cd-001234.txt')

    # load library
    library(stringr)

    # prepare regular expression
    regexp <- "[[:digit:]]+"

    # process string
    str_extract(data, regexp)

Which gives the desired result:

    [1] "001234" "001234"

To explain the regexp a little:

[[:digit:]] matches any single digit, 0 to 9

+ means the preceding item (in this case, a digit) will be matched one or more times
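To see what the two pieces of the pattern do, here is a minimal sketch in base R only (no stringr), so it is not the method from the answer above but an equivalent built on regexpr() and regmatches(). The variable names x, one_digit, digit_run and digits_only are illustrative assumptions, and the input is the sample string from the question.

```r
# Sample input, taken from the question
x <- c("Ab_Cd-001234.txt", "Ab_Cd-001234.txt")

# Without "+", the pattern matches one digit: the first digit found
one_digit <- regmatches(x, regexpr("[[:digit:]]", x))

# With "+", the match extends over the whole run of consecutive digits
digit_run <- regmatches(x, regexpr("[[:digit:]]+", x))

# Alternative closer to the "replace" wording: delete every non-digit
digits_only <- gsub("[^[:digit:]]", "", x)

print(one_digit)
print(digit_run)
print(digits_only)
```

For these inputs, digit_run and digits_only both hold "001234" for each string. They differ when a string contains several separate runs of digits: regexpr() returns only the first run, while the gsub() variant glues all digits together. Note also that regmatches() drops elements with no match, so its result can be shorter than the input.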

This page is also very useful for this kind of string processing: http://en.wikibooks.org/wiki/R_Programming/Text_Processing

Ben answered Sep 22 '22 23:09
