I have a column in a data frame where the values are letter-number combinations like G1, K8, A132, etc. I want to split the letter from the number but keep the number as a single number. I have been using strsplit, but this returns a list of individual characters, as seen below, where I would like the output to be G and 10:
x <- "G10"
strsplit(x, "")[[1]][1]
"G"
strsplit(x, "")[[1]][-1]
"1" "0"
This leads to the predictable downstream problems when I try to use the numbers as numbers. Here is a paste example where I would like to get "somethingelse_10":
z <- strsplit(x, "")[[1]][-1]
paste("somethingelse",z, sep="_")
"somethingelse_1" "somethingelse_0"
Is there an easy way to split numbers from letters?
You can use gsub to eliminate all non-digit or all digit characters, like so:
> x <- "A3"
> gsub("[^[:digit:]]","",x)
"3"
> gsub("[[:digit:]]","",x)
"A"
You can then use as.numeric to convert from string to number, if you desire.
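As a rough sketch of how this could apply to the values from the question, the snippet below runs both gsub calls over a vector and pastes the result. The vector contents and the variable names `number`, `letter`, and `label` are illustrative assumptions, not part of the original answer.

```r
# Hypothetical example values, modelled on the question
x <- c("G10", "K8", "A132")

# Remove every non-digit character, then convert the remainder to numeric
number <- as.numeric(gsub("[^[:digit:]]", "", x))

# Remove every digit character to keep only the letter part
letter <- gsub("[[:digit:]]", "", x)

# Each number is now a single value, so paste gives one label per element
label <- paste("somethingelse", number, sep = "_")
```

Because gsub is vectorised, the same calls should work directly on a data frame column.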
The stringr package often has convenient functions for this sort of thing:
library(stringr)
str_extract(c("A1","B2","C123"),"[[:upper:]]")
#[1] "A" "B" "C"
str_extract(c("A1","B2","C123"),"[[:digit:]]+")
#[1] "1" "2" "123"
That assumes each element has exactly one "letter" part and one "number" part, since str_extract just pulls the first instance of a match.
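If you would rather not depend on stringr, a similar first-match extraction can be sketched in base R with regexpr and regmatches. This is an alternative under the same one-letter-part, one-number-part assumption, not what the answer above uses; the variable names are illustrative.

```r
x <- c("A1", "B2", "C123")

# regexpr locates the first match in each element;
# regmatches pulls out the matched text
letter_part <- regmatches(x, regexpr("[[:alpha:]]+", x))
digit_part  <- regmatches(x, regexpr("[[:digit:]]+", x))

# Convert the digit strings to numbers
number <- as.numeric(digit_part)
```

One difference to be aware of: regmatches drops elements that have no match, so the result can be shorter than the input, whereas str_extract returns NA for those elements.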
If, as your comment suggests, you just have a single letter followed by one or more digits, you could do something similar to this:
x <- c("G10", "X1231", "y14522")
# Just grab the first letter
letter <- substring(x, 1, 1)
letter
# [1] "G" "X" "y"
# Grab everything except the first character and convert to numeric
number <- as.numeric(substring(x, 2, nchar(x)))
number
#[1] 10 1231 14522
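Since the question is about a data frame column, here is a minimal sketch of the same substring approach applied to one. The data frame `df` and its column `id` are hypothetical names chosen for illustration, and the approach still assumes exactly one leading letter.

```r
# Hypothetical data frame with a letter-number column
df <- data.frame(id = c("G10", "X1231", "y14522"), stringsAsFactors = FALSE)

# First character is the letter
df$letter <- substring(df$id, 1, 1)

# Everything from the second character onward, converted to numeric
df$number <- as.numeric(substring(df$id, 2))
```

Omitting the third argument of substring takes everything to the end of each string, so nchar is not needed here.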