Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to convert single column data into multiple columns in python dataframe?

I have a data frame (df) with column col1 which has many rows and there are rows with a common string (Collection of numbers are) and ending with different numbers(001, 002, 005). I want to extract the rows between two string (Collection of numbers are 002 to Collection of numbers are 003) and assign them to new column with same row name (Collection of numbers are 002)

    col1
0   Collection of numbers are 002
1   53
2   20
3   56
4   Collection of numbers are 003
5   236
6   325
7   Collection of numbers are 005
8   96
9   23
10  63

I want to convert the how above data frame to the following format.

0   Collection of numbers are 002   Collection of numbers are 003   Collection of numbers are 005
1   53                              236                              96
2   20                              325                              23
3   56                                                               63

Note: No repeated numbers

like image 240
Raghu Avatar asked Apr 09 '26 16:04

Raghu


1 Answers

We could try ffill and some basic resphaping using str.split

df['headers'] = df['col1'].str.extract('(Collection.*)').ffill()


df1 = df[~df['col1'].str.contains('Collection')].copy()


df1.groupby('headers').agg(','.join)['col1'].str.split(',',expand=True).T.rename_axis('',axis='columns')

out:

  Collection of numbers are 002 Collection of numbers are 003  \
0                            53                           236   
1                            20                           325   
2                            56                          None   

  Collection of numbers are 005  
0                            96  
1                            23  
2                            63  
like image 63
Umar.H Avatar answered Apr 11 '26 05:04

Umar.H