
Splitting strings into number and string (with missings)


Tags: string, regex, r, tidyverse

I am trying to separate numbers and characters in a column of strings. So far I have been using tidyr::separate for doing this, but am encountering errors for "unusual" cases.

Suppose I have the following data

df <- data.frame(c1 = c("5.5K", "2M", "3.1", "M"))

And I want to obtain a data frame with columns

data.frame(c2 = c("5.5", "2", "3.1", NA),
           c3 = c("K", "M", NA, "M"))

So far I have been using tidyr::separate

df %>%
  separate(c1, into = c("c2", "c3"), sep = "(?<=[0-9])(?=[A-Za-z])")

But this only works for the first three cases. I realize this is because the lookbehind (?<=[0-9]) and the lookahead (?=[A-Za-z]) must both match at the split point, so a string with no digit before the letters, such as "M", is never split. How would one modify this code to handle the cases where the number before the letters is missing? I have also tried the extract function, but without success.

Edit: I suppose one solution is to break this up into

library(stringr)
df$c2 <- as.numeric(str_extract(df$c1, "[0-9.]+"))  # include "." so "5.5" is not cut to "5"
df$c3 <- str_extract(df$c1, "[A-Za-z]+")            # "[aA-zZ]" would also match some punctuation

But I was curious whether there were other ways to handle it.


asked Apr 16 '19 03:04

user11151932

1 Answer

extract(df, c1, into =c("c2", "c3"), "([\\.\\d]*)([a-zA-Z]*)")
#    c2 c3
# 1 5.5  K
# 2   2  M
# 3 3.1   
# 4      M

You can also use separate in this simple way, though there may be a more elegant method.

df %>% separate(c1, into =c("c2", "c3"), sep = "(?=[A-Za-z])")
#    c2   c3
# 1 5.5    K
# 2   2    M
# 3 3.1 <NA>
# 4        M
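
Both approaches above leave an empty string where a part is missing, while the question asks for NA. A minimal sketch of one way to close that gap, assuming tidyr is installed: the anchored pattern and the variable name out are my own choices for illustration, not part of the original answer.

```r
library(tidyr)

df <- data.frame(c1 = c("5.5K", "2M", "3.1", "M"), stringsAsFactors = FALSE)

# Both capture groups are optional, so every row matches;
# a missing number or unit comes back as an empty string.
out <- extract(df, c1, into = c("c2", "c3"), regex = "^([0-9.]*)([A-Za-z]*)$")

# Replace empty strings with NA in every column (base R, no extra packages).
out[] <- lapply(out, function(col) replace(col, col %in% "", NA))

out
```

If the numeric part is needed as a number rather than text, as.numeric(out$c2) can be applied afterwards; the NA entries stay NA.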


answered Oct 13 '22 20:10

VicaYang
