
Split columns by number in a dataframe

I'm trying to separate a column in a rather untidy dataframe.

section
View 500
V458
453

I want to create a new column from this, with the preferred output shown below.

section  section numbers  
View     500
V        458
         453

I've been trying to research it, but I'm having a hard time with it. I can separate them in the case of the first row, because I can use a regex like this.

df_split <- separate(df, col = section, into = c("section", "section_number"), sep = " +[1-9]")

But I can't seem to find a way to use an "or" type statement. If anyone has any input that would be wonderful.

sevpants asked Dec 23 '16 20:12


2 Answers

A simple gsub would be my choice here:

section <- c('View 500', 'V458', '453')

cbind(section = trimws(gsub('[0-9]', '', section)), 
      section_numbers = trimws(gsub('[a-zA-Z]', '', section)))

I use trimws to remove any leading or trailing whitespace left behind once the digits or letters are stripped.

Output:

    section section_numbers
[1,] "View"  "500"          
[2,] "V"     "458"          
[3,] ""      "453" 
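
Note that cbind returns a character matrix here, not a data frame. If a data frame with a numeric column is what you are after, something along these lines might work. This is only a sketch: the names df and result and the column name section_number are assumptions for illustration, and it assumes every row contains at least one digit.

```r
# Hypothetical input mirroring the question's data
df <- data.frame(section = c("View 500", "V458", "453"),
                 stringsAsFactors = FALSE)

result <- data.frame(
  # Drop the digits, then trim the space left over from "View 500"
  section = trimws(gsub("[0-9]", "", df$section)),
  # Drop everything that is not a digit and convert to numeric
  section_number = as.numeric(gsub("[^0-9]", "", df$section)),
  stringsAsFactors = FALSE
)

print(result)
```

Using "[^0-9]" instead of "[a-zA-Z]" for the number column also removes spaces and punctuation, so trimws is not needed on that side.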

LyzandeR answered Sep 28 '22 09:09


You can use tidyr for this:

tidyr::extract(df, section, c("section", "section number"),
               regex = "([[:alpha:]]*)[[:space:]]*([[:digit:]]*)")

Output:

  section section number
1    View            500
2       V            458
3                    453
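
The reason this handles all three rows is that each group uses `*`, so the letters, the space, and the digits are all optional. That stands in for the "or" the question asks about. For anyone without tidyr, roughly the same idea can be sketched in base R with sub and backreferences. The `^` and `$` anchors and the names pattern, parts and section_number are my additions for illustration, not part of the answer above.

```r
section <- c("View 500", "V458", "453")

# Group 1: optional letters; then optional whitespace; group 2: optional digits
pattern <- "^([[:alpha:]]*)[[:space:]]*([[:digit:]]*)$"

parts <- data.frame(
  section = sub(pattern, "\\1", section),        # keep only the letters
  section_number = sub(pattern, "\\2", section), # keep only the digits
  stringsAsFactors = FALSE
)

print(parts)
```

A value that does not match the whole pattern (for example one with digits before letters) would be returned unchanged by sub, so it is worth checking the result on real data.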

HubertL answered Sep 28 '22 07:09