
Pandas split column into multiple columns by comma

I am trying to split a column into multiple columns based on comma/space separation.

My dataframe currently looks like

       KEYS      1
    0  FIT-4270  4000.0439
    1  FIT-4269  4000.0420, 4000.0471
    2  FIT-4268  4000.0419
    3  FIT-4266  4000.0499
    4  FIT-4265  4000.0490, 4000.0499, 4000.0500, 4000.0504,

I would like

       KEYS      1          2          3          4
    0  FIT-4270  4000.0439
    1  FIT-4269  4000.0420  4000.0471
    2  FIT-4268  4000.0419
    3  FIT-4266  4000.0499
    4  FIT-4265  4000.0490  4000.0499  4000.0500  4000.0504

My code currently removes the KEYS column and I'm not sure why. Could anyone improve it or help fix the issue?

v = dfcleancsv[1]
# splits the columns by spaces into new columns but removes KEYS?
dfcleancsv = dfcleancsv[1].str.split(' ').apply(Series, 1)
Anekdotin asked Jun 02 '16 19:06



2 Answers

In case someone else wants to split a single column (delimited by a value) into multiple columns, try this:

series.str.split(',', expand=True) 

This answered the question I came here looking for.
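As a minimal sketch of what that call returns (the sample values below are made up for illustration, not taken from a real file): with expand=True the result is a DataFrame with one column per piece, and shorter rows are padded with missing values.

```python
import pandas as pd

# Hypothetical sample data, loosely modelled on the question's column.
series = pd.Series([
    "4000.0439",
    "4000.0420,4000.0471",
    "4000.0490,4000.0499,4000.0500",
])

# expand=True returns a DataFrame with one column per piece;
# rows with fewer pieces are padded with missing values.
parts = series.str.split(",", expand=True)
print(parts)
```

Without expand=True the same call returns a Series of lists, which is why the extra apply(Series, 1) step appears in the question.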

Credit to EdChum's code that includes adding the split columns back to the dataframe.

pd.concat([df[[0]], df[1].str.split(', ', expand=True)], axis=1) 

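Note: The first element of the list, df[[0]], is a DataFrame (the double brackets keep it two-dimensional).

The second element, df[1].str.split(', ', expand=True), splits the Series df[1] and returns a DataFrame with one column per piece.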
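One thing to watch for, shown here as a sketch on a made-up two-row frame: when the key column is labelled 0, the split pieces are also labelled 0, 1, ..., so the concatenated result ends up with two columns called 0. Relabelling the pieces first (the + 1 shift below is just one possible choice) avoids that.

```python
import pandas as pd

# Hypothetical frame with integer column labels, as in the snippet above.
df = pd.DataFrame({
    0: ["FIT-4270", "FIT-4269"],
    1: ["4000.0439", "4000.0420, 4000.0471"],
})

# The split pieces come back labelled 0, 1, ...; shift them to 1, 2, ...
# so they do not collide with the key column's label 0.
pieces = df[1].str.split(", ", expand=True)
pieces.columns = pieces.columns + 1

# Both inputs share the same row index, so axis=1 lines them up side by side.
result = pd.concat([df[[0]], pieces], axis=1)
print(result)
```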

split Documentation

concat Documentation

Anthony R answered Sep 19 '22 01:09


Using EdChum's answer of

pd.concat([df[[0]], df[1].str.split(', ', expand=True)], axis=1) 

I was able to solve it by substituting my variables.

dfcleancsv = pd.concat([dfcleancsv['KEYS'], dfcleancsv[1].str.split(', ', expand=True)], axis=1) 
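For context, the original attempt assigned the result of splitting column 1 back to dfcleancsv, so the frame was replaced by the split pieces alone and KEYS was lost. The sketch below rebuilds the sample data from the question and adds two optional touches that are my assumptions rather than part of the accepted fix: stripping the trailing ", " seen in the last row (otherwise it produces an empty extra column) and labelling the pieces 1, 2, 3, ... as in the desired output.

```python
import pandas as pd

# Sample frame rebuilt from the question; the value column is labelled 1 there.
dfcleancsv = pd.DataFrame({
    "KEYS": ["FIT-4270", "FIT-4269", "FIT-4268", "FIT-4266", "FIT-4265"],
    1: [
        "4000.0439",
        "4000.0420, 4000.0471",
        "4000.0419",
        "4000.0499",
        "4000.0490, 4000.0499, 4000.0500, 4000.0504, ",
    ],
})

# Drop trailing commas/spaces first, otherwise the last row yields an empty extra piece.
values = dfcleancsv[1].str.rstrip(", ")

# A multi-character pattern is treated as a regular expression:
# a comma followed by optional whitespace.
pieces = values.str.split(r",\s*", expand=True)
pieces.columns = range(1, pieces.shape[1] + 1)  # label the pieces 1, 2, 3, ...

# join aligns on the row index and keeps KEYS next to the new columns.
result = dfcleancsv[["KEYS"]].join(pieces)
print(result)
```

join is used here instead of pd.concat only for brevity; with matching row indexes the two give the same layout.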
Anekdotin answered Sep 17 '22 01:09