Split character data into numbers and letters

Tags:

r

I have a vector of character data. Most of the elements in the vector consist of one or more letters followed by one or more numbers. I wish to split each element in the vector into the character portion and the number portion. I found a similar question on Stack Overflow here:

split a character from a number with multiple digits

However, the answer given there does not seem to work completely in my case, or I am doing something wrong. An example vector is below:

my.data <- c("aaa", "b11", "b21", "b101", "b111", "ccc1", "ddd1", "ccc20", "ddd13")

# I can obtain the number portion using:
gsub("[^[:digit:]]", "", my.data)

# However, I cannot obtain the character portion using:
gsub("[:digit:]", "", my.data)

How can I obtain the character portion? I am using R version 2.14.1 on a Windows 7 64-bit machine.

asked Mar 18 '12 05:03

Mark Miller

2 Answers

Since none of the previous answers use tidyr::separate, here it goes:

library(tidyr)

df <- data.frame(mycol = c("APPLE348744", "BANANA77845", "OATS2647892", "EGG98586456"))

df %>%
  separate(mycol,
           into = c("text", "num"),
           sep = "(?<=[A-Za-z])(?=[0-9])")

answered Sep 26 '22 14:09

meriops

For your regex you have to use:

gsub("[[:digit:]]","",my.data)

The [:digit:] character class only makes sense inside a set of [].
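
As a minimal sketch of this fix applied to the vector from the question, the two gsub calls can be combined to get both portions. The names letters.part, numbers.part and result, and the column names text and num, are chosen here for illustration only and are not part of the original answer.

```r
my.data <- c("aaa", "b11", "b21", "b101", "b111", "ccc1", "ddd1", "ccc20", "ddd13")

# Delete every digit, which leaves the character portion.
letters.part <- gsub("[[:digit:]]", "", my.data)

# Delete every non-digit, which leaves the number portion (as in the question).
numbers.part <- gsub("[^[:digit:]]", "", my.data)

# Illustrative layout: one row per element. An element with no digits
# yields an empty string, which as.integer() converts to NA.
result <- data.frame(text = letters.part,
                     num = as.integer(numbers.part),
                     stringsAsFactors = FALSE)
print(result)
```

Note that this approach removes digits wherever they occur, so it only gives a clean split when each element has the form letters followed by numbers, as in the example vector.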

answered Sep 22 '22 14:09

mathematical.coffee
