
Convert dataframe column values to new columns

I have a dataframe containing some data, which I want to transform, so that the values of one column define the new columns.

>>> import pandas as pd
>>> df = pd.DataFrame({'A': ['a','a','b','b'], 'B': [6,7,8,9]})
>>> df
   A  B
0  a  6
1  a  7
2  b  8
3  b  9

The values of column A should become the column names of the new dataframe. The result of the transformation should look like this:

   a  b
0  6  8
1  7  9

What I came up with so far didn't work completely:

>>> pd.DataFrame({ k : df.loc[df['A'] == k, 'B'] for k in df['A'].unique() })
     a    b
0    6  NaN
1    7  NaN
2  NaN    8
3  NaN    9

Besides this being incorrect, I guess there probably is a more efficient way anyway. I'm just really having a hard time understanding how to handle things with pandas.

KorbenDose asked Feb 04 '23 23:02


2 Answers

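You were almost there. Each selection keeps its original index labels (0 and 1 for a, 2 and 3 for b), and pandas aligns the Series on those labels, which is where the NaNs come from. Take .values so the data is passed as plain arrays and lines up by position:

pd.DataFrame({ k : df.loc[df['A'] == k, 'B'].values for k in df['A'].unique() })

The dictionary keys become the column names, so no separate columns argument is needed. Note that this only works when every group has the same number of rows.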

Output:

    a   b
0   6   8
1   7   9
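To make the index alignment behind the NaN result visible, here is a minimal sketch. It assumes the df from the question with columns named A and B; the names a_part, b_part, aligned and by_position are only illustrative. It also shows reset_index(drop=True) as another way, besides .values, to discard the old row labels.

```python
import pandas as pd

df = pd.DataFrame({"A": ["a", "a", "b", "b"], "B": [6, 7, 8, 9]})

# Each selection keeps the row labels it had in df.
a_part = df.loc[df["A"] == "a", "B"]  # row labels 0, 1
b_part = df.loc[df["A"] == "b", "B"]  # row labels 2, 3

# Building a frame from Series aligns on those labels, so no row is
# shared between the two columns and the gaps are filled with NaN.
aligned = pd.DataFrame({"a": a_part, "b": b_part})
print(aligned)

# Dropping the old labels lets the values line up by position instead.
by_position = pd.DataFrame({
    "a": a_part.reset_index(drop=True),
    "b": b_part.reset_index(drop=True),
})
print(by_position)
```

The first frame has four rows with two NaN per column, the second has the two rows asked for in the question.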
harvpan answered Feb 06 '23 14:02


Using a dictionary comprehension with groupby (each group's values of column B become one new column, named after the group key from column A):

res = pd.DataFrame({col: vals['B'].values for col, vals in df.groupby('A')})

print(res)

   a  b
0  6  8
1  7  9
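Both dictionary-based approaches above need every group to have the same number of rows, otherwise the arrays have different lengths and the DataFrame constructor raises an error. One possible alternative, which is not taken from either answer, is to number the rows within each group with groupby(...).cumcount() and then pivot. The sketch below assumes columns named A and B, adds an extra row (b, 10) purely for illustration, and uses a hypothetical helper column called pos.

```python
import pandas as pd

# Same data as in the question plus one extra "b" row, so the groups
# have different sizes (the extra row is an assumption for illustration).
df = pd.DataFrame({"A": ["a", "a", "b", "b", "b"], "B": [6, 7, 8, 9, 10]})

# Number the rows within each group: 0, 1 for "a" and 0, 1, 2 for "b".
pos = df.groupby("A").cumcount()

# Use that position as the row index and the values of A as the columns.
res = df.assign(pos=pos).pivot(index="pos", columns="A", values="B")

# Cosmetic: clear the axis names that pivot leaves behind.
res.index.name = None
res.columns.name = None

print(res)
```

The shorter group is padded with NaN at the end, so the affected columns are shown as floats rather than integers.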
jpp answered Feb 06 '23 15:02