I have a dataframe containing some data, which I want to transform, so that the values of one column define the new columns.
>>> import pandas as pd
>>> df = pd.DataFrame([['a','a','b','b'],[6,7,8,9]], index=['A','B']).T
>>> df
A B
0 a 6
1 a 7
2 b 8
3 b 9
The values of column A should become the column names of the new dataframe. The result of the transformation should look like this:
a b
0 6 8
1 7 9
What I came up with so far didn't work completely:
>>> pd.DataFrame({ k : df.loc[df['A'] == k, 'B'] for k in df['A'].unique() })
a b
0 6 NaN
1 7 NaN
2 NaN 8
3 NaN 9
Besides this being incorrect, I guess there probably is a more efficient way anyway. I'm just really having a hard time understanding how to handle things with pandas.
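You were almost there. Each selection df.loc[df['A'] == k, 'B'] is a Series that keeps its original row labels, and pandas aligns the Series on those labels, which is where the NaN values come from. Take .values to get plain arrays, which are combined by position instead. The dictionary keys already supply the column names, so no extra wrapper or columns argument is needed. Note that this requires every group to have the same number of rows.
pd.DataFrame({ k : df.loc[df['A'] == k, 'B'].values for k in df['A'].unique() })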
Output:
a b
0 6 8
1 7 9
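The difference between the two versions can be checked directly. The following is only a minimal sketch: it builds the question's data with named columns from a dict (an assumption, the question constructs it via a transpose), and the names a, b, aligned and positional are made up for illustration.

```python
import pandas as pd

# Same values as in the question, built directly with named columns.
df = pd.DataFrame({'A': ['a', 'a', 'b', 'b'], 'B': [6, 7, 8, 9]})

a = df.loc[df['A'] == 'a', 'B']   # Series that keeps row labels 0 and 1
b = df.loc[df['A'] == 'b', 'B']   # Series that keeps row labels 2 and 3

# Series are aligned on their index labels, so the result has four rows
# and NaN wherever a label is missing from one of the Series.
aligned = pd.DataFrame({'a': a, 'b': b})

# Plain arrays carry no labels, so they are placed side by side by position.
positional = pd.DataFrame({'a': a.values, 'b': b.values})

print(aligned)
print(positional)
```

Calling reset_index(drop=True) on each Series instead of .values should have the same effect here, since it also discards the original row labels.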
Using a dictionary comprehension with groupby:
res = pd.DataFrame({col: vals['B'].values for col, vals in df.groupby('A')})
print(res)
a b
0 6 8
1 7 9
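If the groups can differ in size, building the frame from a dict of arrays raises an error. One possible alternative, sketched here and not taken from the answers above, is to number the rows within each group with cumcount and then pivot. The helper column name pos and the extra third 'b' row are assumptions added purely for illustration.

```python
import pandas as pd

# Question data plus one hypothetical extra 'b' row to get unequal group sizes.
df = pd.DataFrame({'A': ['a', 'a', 'b', 'b', 'b'], 'B': [6, 7, 8, 9, 10]})

# Number the rows within each group: 0, 1 for 'a' and 0, 1, 2 for 'b'.
pos = df.groupby('A').cumcount()

# Use that position as the new row index and the values of A as columns.
# Positions missing from a shorter group are filled with NaN.
res = df.assign(pos=pos).pivot(index='pos', columns='A', values='B')

print(res)
```

With equal-sized groups this should give the same values as the dictionary comprehension, apart from the index and columns being named pos and A.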