
split character data into numbers and letters

Tags:

r

I have a vector of character data. Most of the elements in the vector consist of one or more letters followed by one or more numbers. I wish to split each element in the vector into the character portion and the number portion. I found a similar question on Stack Overflow here:

split a character from a number with multiple digits

However, the answer given above does not seem to work completely in my case or I am doing something wrong. An example vector is below:

my.data <- c("aaa", "b11", "b21", "b101", "b111", "ccc1", "ddd1", "ccc20", "ddd13")

# I can obtain the number portion using:
gsub("[^[:digit:]]", "", my.data)

# However, I cannot obtain the character portion using:
gsub("[:digit:]", "", my.data)

How can I obtain the character portion? I am using R version 2.14.1 on a Windows 7 64-bit machine.

Mark Miller asked Mar 18 '12 05:03



2 Answers

Since none of the previous answers use tidyr::separate, here it goes:

library(tidyr)

df <- data.frame(mycol = c("APPLE348744", "BANANA77845", "OATS2647892", "EGG98586456"))

df %>%
  separate(mycol,
           into = c("text", "num"),
           sep = "(?<=[A-Za-z])(?=[0-9])"
           )

meriops answered Sep 26 '22 14:09


For your regex you have to use:

gsub("[[:digit:]]","",my.data) 

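The [:digit:] character class only makes sense inside a set of []. Written on its own, "[:digit:]" is read as an ordinary bracket expression matching any of the literal characters ":", "d", "i", "g" and "t", so it strips those letters rather than the digits.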
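
As a minimal sketch, here is the bracketed class applied to the example vector from the question, giving both portions side by side. The names letters_part, numbers_part and result are only illustrative and do not come from the question.

```r
# Example vector from the question.
my.data <- c("aaa", "b11", "b21", "b101", "b111", "ccc1", "ddd1", "ccc20", "ddd13")

# Remove every digit to keep the character portion.
letters_part <- gsub("[[:digit:]]", "", my.data)

# Remove every non-digit to keep the number portion
# (elements without digits become the empty string "").
numbers_part <- gsub("[^[:digit:]]", "", my.data)

# Collect both portions next to the original values.
result <- data.frame(original = my.data,
                     letters  = letters_part,
                     numbers  = numbers_part,
                     stringsAsFactors = FALSE)
print(result)
```

For this vector, letters_part should come out as "aaa", "b", "b", "b", "b", "ccc", "ddd", "ccc", "ddd", and numbers_part as "", "11", "21", "101", "111", "1", "1", "20", "13".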

mathematical.coffee answered Sep 22 '22 14:09