Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to transpose a column into multiple rows for a column value in python? [closed]

If I have a dataframe and it has col1 which is file name, col2 is values that I want transpose into file name with its values for example:

Input:
col1  col2
file1 text_0
file1 text_1
file1 text_2
file2 text_0
file2 text_1
file2 text_2
file2 text_3
file3 text_0

Output:
col1  col2   col3   col4   col5
file1 text_0 text_1 text_2 
file2 text_0 text_1 text_2 text_3
file3 text_0 

1 Answers

First idea is use GroupBy.cumcount for counter of duplicated values of col1 for new columns names and reshape by Series.unstack:

df = (df.set_index(['col1',df.groupby('col1').cumcount()])['col2']
        .unstack(fill_value='')
        .reset_index())
df.columns = [f'col{x}' for x in range(1, len(df.columns) + 1)]
print (df)
    col1    col2    col3    col4    col5
0  file1  text_0  text_1  text_2        
1  file2  text_0  text_1  text_2  text_3
2  file3  text_0                        

Or create Series of list and avoid use apply(pd.Series), because slow, better is use DataFrame constructor:

s = df.groupby('col1')['col2'].apply(list)
df = pd.DataFrame(s.tolist(), index=s.index).reset_index().fillna('')
df.columns = [f'col{x}' for x in range(1, len(df.columns) + 1)]
print (df)
    col1    col2    col3    col4    col5
0  file1  text_0  text_1  text_2        
1  file2  text_0  text_1  text_2  text_3
2  file3  text_0                        

Alternative:

s = df.groupby('col1')['col2'].apply(list)

L = [[k] + v for k, v in s.items()]
df = pd.DataFrame(L).fillna('').rename(columns=lambda x: f'col{x+1}')
print (df)
    col1    col2    col3    col4    col5
0  file1  text_0  text_1  text_2        
1  file2  text_0  text_1  text_2  text_3
2  file3  text_0                        
like image 177
jezrael Avatar answered Feb 23 '26 01:02

jezrael