
R: how to find the first digit in a string

Tags: regex, r

    string = "ABC3JFD456"

Suppose I have the above string, and I wish to find the first digit in the string and store its value. In this case, I would want to store the value 3 (since it's the first-occurring digit in the string). grepl("\\d", string) only returns a logical value; it does not tell me where the first digit is or what it is. Which regular expression should I use to find the value of the first digit?

Adrian asked Dec 01 '14 22:12



2 Answers

Base R

regmatches(string, regexpr("\\d", string))
## [1] "3"

Or using stringi

library(stringi)
stri_extract_first(string, regex = "\\d")
## [1] "3"

Or using stringr

library(stringr)
str_extract(string, "\\d")
## [1] "3"
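
One thing to watch with the base R version: regmatches() drops elements that have no match, so on a vector the result can be shorter than the input. Below is a minimal sketch of a wrapper that keeps one result per input and returns NA where there is no digit. The name first_digit is made up for illustration, and it uses [0-9] rather than \\d purely as a matter of preference.

```r
# Hypothetical helper (not from the answers above): first digit of each
# element of x, or NA when an element contains no digit.
first_digit <- function(x) {
  m <- regexpr("[0-9]", x)               # position of first digit, -1 if none
  hit <- !is.na(m) & m != -1             # elements that actually matched
  out <- rep(NA_character_, length(x))   # one slot per input element
  out[hit] <- regmatches(x, m)           # regmatches() returns matches only
  out
}

x <- c("ABC3JFD456", "no digits", "7up")
digits <- first_digit(x)

# Convert to a number if the value is to be stored numerically.
values <- as.integer(digits)
```

Here digits should come out as "3", NA, "7", and values as the corresponding integers.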
David Arenburg answered Oct 09 '22 01:10


1) sub. Try sub with the regular expression below. It matches the shortest possible prefix up to the first digit, captures that digit, then matches everything after it, and replaces the whole match with the captured digit:

sub(".*?(\\d).*", "\\1", string)

giving:

[1] "3"

This also works if string is a vector of strings.
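
A caveat worth checking when using this on a vector: if an element contains no digit, the pattern does not match and sub returns that element unchanged. The sketch below shows this, along with one possible guard using grepl; the variable names are only illustrative.

```r
x <- c("ABC3JFD456", "xyz")

# No digit in "xyz", so sub() leaves it as-is instead of returning NA.
raw <- sub(".*?(\\d).*", "\\1", x)

# One way to guard against that: only keep the result where a digit exists.
guarded <- ifelse(grepl("\\d", x), raw, NA_character_)
```

With this input, raw is "3" "xyz" while guarded is "3" NA.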

2) strapplyc It would also be possible to use strapplyc from gsubfn in which case an even simpler regular expression could be used:

strapplyc(string, "\\d", simplify = TRUE)[1]

giving the same result. Alternatively, use the following, which gives the same answer again but also works if string is a vector of strings:

sapply(strapplyc(string, "\\d"), "[[", 1)
G. Grothendieck answered Oct 09 '22 00:10