
Transform rows to columns by the values of two rows in pandas

Tags: python, pandas

I have a large dataset with two columns, Name and Value, that looks like this:

import pandas as pd
data = [['code',10],['classe',12],['series','B'], ['code',12],['classe',1],
['series','C'],['code',16],['classe',18],['series','A']]
df1 = pd.DataFrame(data,columns=['Name','Value'])
df1

Output

    Name    Value
0   code    10
1   classe  12
2   series  B
3   code    12
4   classe  1
5   series  C
6   code    16
7   classe  18
8   series  A

And I want something like this:

    code  classe series
0   10      12    B
1   12      1     C
2   16      18    A

In my dataset this pattern repeats N times, and I want to transform it into three columns: code, classe, and series.

Thanks for your help in advance!

M-M asked Apr 20 '18 14:04




1 Answer

You can accomplish this using .pivot, then dropping the NaN values from each column and resetting its index:

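# Spread each Name into its own column; cells belonging to other names become NaN.
df2 = df1.pivot(columns='Name', values='Value')
# Drop the NaN gaps in each column, renumber it from 0, and join the columns side by side.
pd.concat([df2[col].dropna().reset_index(drop=True) for col in df2], axis=1)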

Output

  classe    code    series
0   12       10     B
1   1        12     C
2   18       16     A
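
As an alternative sketch (not part of the original answer), each occurrence of a name can be numbered with groupby(...).cumcount() and that counter used as the pivot index, so the values line up without a separate NaN cleanup step. The helper column name `row` is made up for illustration. Like the code above, this pairs values purely by position within each name.

```python
import pandas as pd

data = [['code', 10], ['classe', 12], ['series', 'B'],
        ['code', 12], ['classe', 1], ['series', 'C'],
        ['code', 16], ['classe', 18], ['series', 'A']]
df1 = pd.DataFrame(data, columns=['Name', 'Value'])

# Number each occurrence of a name: the first 'code' gets 0, the second 1, and so on.
occurrence = df1.groupby('Name').cumcount()

# Use that counter as the row index so pivot places matching occurrences on the same row.
# 'row' is a hypothetical helper column name chosen for this sketch.
result = (
    df1.assign(row=occurrence)
       .pivot(index='row', columns='Name', values='Value')
       [['code', 'classe', 'series']]            # column order from the question
       .rename_axis(index=None, columns=None)    # drop the 'row' / 'Name' axis labels
)
print(result)
```

Selecting the columns explicitly restores the code, classe, series order asked for in the question, since pivot sorts the new columns alphabetically.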

Moreover, if the order or number of rows changes, the same code still runs. Each column is compacted independently, so a column with fewer values is padded with NaN:

import pandas as pd
data = [['code',10],['classe',12],['classe', 14], ['series','B'], ['series', 'C'], ['code',12],['classe',1],
['series','C'],['code',16],['classe',18],['series','A']]
df1 = pd.DataFrame(data,columns=['Name','Value'])
df1

    Name    Value
0   code    10
1   classe  12
2   classe  14 #Added classe
3   series  B
4   series  C  #Added Series
5   code    12
6   classe  1
7   series  C
8   code    16
9   classe  18
10  series  A

The output will be:

   classe   code    series
0   12       10      B
1   14       12      C
2   1        16      C
3   18      NaN      A
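
Note that in this output a row no longer corresponds to one original record: row 1 combines classe 14 from the first group with code 12 from the second. If every record is known to start with a code row (an assumption about the data that the question does not state), a minimal sketch that keeps records together could look like the following. The column name `record` is hypothetical, and keeping the first value of a repeated name is an arbitrary choice.

```python
import pandas as pd

data = [['code', 10], ['classe', 12], ['classe', 14], ['series', 'B'], ['series', 'C'],
        ['code', 12], ['classe', 1], ['series', 'C'],
        ['code', 16], ['classe', 18], ['series', 'A']]
df1 = pd.DataFrame(data, columns=['Name', 'Value'])

# Assumption: each record begins with a 'code' row, so a running count of
# 'code' rows labels every row with the record it belongs to (0, 1, 2, ...).
record_id = (df1['Name'] == 'code').cumsum() - 1

# Keep the first value seen for each (record, name) pair, then move the names to columns.
result = (
    df1.assign(record=record_id)
       .groupby(['record', 'Name'])['Value']
       .first()
       .unstack('Name')
       [['code', 'classe', 'series']]
)
print(result)
```

With this grouping the extra classe 14 and series C rows are ignored rather than shifting later values into the wrong row.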

iDrwish answered Sep 29 '22 03:09