I have a character string in which each letter is followed by a number. For instance: A1B10C5
I would like to split it into letter <- c("A", "B", "C")
and number <- c(1, 10, 5)
using R.
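We can use regex lookarounds to split at the boundaries between letters and numbers:
str1 <- "A1B10C5"
# split wherever a letter is followed by a digit, or a digit by a letter
v1 <- strsplit(str1, "(?<=[A-Za-z])(?=[0-9])|(?<=[0-9])(?=[A-Za-z])", perl = TRUE)[[1]]
# odd positions hold the letters
v1[c(TRUE, FALSE)]
#[1] "A" "B" "C"
# even positions hold the numbers
as.numeric(v1[c(FALSE, TRUE)])
#[1]  1 10  5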
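The alternating index in the strsplit approach works because R recycles a short logical index along the vector, so c(TRUE, FALSE) picks positions 1, 3, 5 and c(FALSE, TRUE) picks 2, 4, 6. A minimal sketch that spells this out and collects the pieces in a data frame, assuming the string starts with a letter and strictly alternates between letters and numbers (the names odd and result are only for illustration):

```r
str1 <- "A1B10C5"
v1 <- strsplit(str1, "(?<=[A-Za-z])(?=[0-9])|(?<=[0-9])(?=[A-Za-z])", perl = TRUE)[[1]]

# explicit form of the recycled index: TRUE at positions 1, 3, 5, ...
odd <- rep(c(TRUE, FALSE), length.out = length(v1))

# letters sit at the odd positions, numbers at the even ones
result <- data.frame(
  letter = v1[odd],
  number = as.numeric(v1[!odd]),
  stringsAsFactors = FALSE
)
result
```

If the input might not alternate cleanly, it would be worth checking that length(v1) is even before building the data frame.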
str_extract_all is another way to do this:
library(stringr)
> str <- "A1B10C5"
> str
[1] "A1B10C5"
> str_extract_all(str, "[0-9]+")
[[1]]
[1] "1"  "10" "5"
> str_extract_all(str, "[A-Za-z]+")
[[1]]
[1] "A" "B" "C"
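One detail worth checking in patterns like these is how the letter class is spelled. In engines that order ranges by code point, a range such as A-z also covers the punctuation that sits between Z and a in ASCII ([, \, ], ^, _ and `), so [A-Za-z] is the safer spelling. A minimal base R sketch of the difference, assuming perl = TRUE and a made-up test string:

```r
# made-up input containing punctuation that sits between "Z" and "a" in ASCII
x <- "A_1[B"

# the range A-z also matches "_" and "["
loose <- regmatches(x, gregexpr("[A-z]+", x, perl = TRUE))[[1]]

# A-Z plus a-z matches letters only
strict <- regmatches(x, gregexpr("[A-Za-z]+", x, perl = TRUE))[[1]]

loose
strict
```

For an input made only of letters and digits, such as A1B10C5, both spellings give the same result; the difference only shows up once punctuation is present.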
To extract letters and numbers at the same time, you can use str_match_all to get them in two separate columns:
library(stringr)
str_match_all("A1B10C5", "([a-zA-Z]+)([0-9]+)")[[1]][,-1]
#     [,1] [,2]
#[1,] "A"  "1"
#[2,] "B"  "10"
#[3,] "C"  "5"
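The matrix returned by str_match_all is all character, so the number column still needs converting. A possible follow-up, sketched under the assumption that stringr is installed (the names m and result are only for illustration):

```r
library(stringr)

# one row per letter-number pair; column 1 is the full match,
# columns 2 and 3 are the two capture groups
m <- str_match_all("A1B10C5", "([a-zA-Z]+)([0-9]+)")[[1]]

# keep the capture groups and convert the digits to numeric
result <- data.frame(
  letter = m[, 2],
  number = as.numeric(m[, 3]),
  stringsAsFactors = FALSE
)
result
```

Because each row comes from a single match, a letter and its number stay paired even if the string contains stray characters between pairs.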
You can also use the base R regmatches with gregexpr:
this <- "A1B10C5"
regmatches(this, gregexpr("[0-9]+", this))
# [[1]]
# [1] "1"  "10" "5"
regmatches(this, gregexpr("[A-Z]+", this))
# [[1]]
# [1] "A" "B" "C"
These return lists with a single element, a character vector. As akrun does, you can extract the list item using [[1]], and you can also convert the vector of digits to numeric like this:
as.numeric(regmatches(this, gregexpr("[0-9]+", this))[[1]])
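If this is needed more than once, the two regmatches calls can be wrapped in a small function. The sketch below is one way to do it in base R; split_letters_numbers is a hypothetical name, not an existing function, and it assumes a single input string:

```r
# hypothetical helper, not part of base R or any package
split_letters_numbers <- function(x) {
  stopifnot(is.character(x), length(x) == 1)
  # all runs of letters, as a character vector
  letter <- regmatches(x, gregexpr("[A-Za-z]+", x))[[1]]
  # all runs of digits, converted to numeric
  number <- as.numeric(regmatches(x, gregexpr("[0-9]+", x))[[1]])
  list(letter = letter, number = number)
}

res <- split_letters_numbers("A1B10C5")
res$letter
res$number
```

Note that the letters and numbers are extracted independently here, so the two vectors only line up pairwise when every letter in the input is followed by a number.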