I have a dataframe as below.
a x 10
b x 11
c x 15
a y 16
b y 17
c y 19
a z 20
b z 21
c z 23
and I want to transform it as below:
x y z
a 10 16 20
b 11 17 21
c 15 19 23
Currently I split the original DataFrame into multiple DataFrames (one each for "a", "b" and "c"), then transpose them and merge them back.
I am sure there is a better solution, hence I am looking for help.
Use pivot:
print (df)
A B C
0 a x 10
1 b x 11
2 c x 15
3 a y 16
4 b y 17
5 c y 19
6 a z 20
7 b z 21
8 c z 23
df = df.pivot(index='A', columns='B', values='C')
print (df)
B x y z
A
a 10 16 20
b 11 17 21
c 15 19 23
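For anyone who wants to run this end to end, here is a minimal self-contained sketch. The column names A, B and C are the ones assumed in this answer (the question does not name its columns), and the variable name wide is only illustrative. The rename_axis call is optional; it drops the "B" label from the column axis so the result looks like the table asked for in the question.

```python
import pandas as pd

# Rebuild the sample data; column names A/B/C are assumed, as in the answer above
df = pd.DataFrame({
    "A": ["a", "b", "c"] * 3,
    "B": ["x"] * 3 + ["y"] * 3 + ["z"] * 3,
    "C": [10, 11, 15, 16, 17, 19, 20, 21, 23],
})

# One row per value of A, one column per value of B, cells filled from C
wide = df.pivot(index="A", columns="B", values="C")

# Optional: remove the "B" name from the column axis
wide = wide.rename_axis(None, axis=1)
print(wide)
```

If A should be an ordinary column again rather than the index, calling wide.reset_index() afterwards should do it.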
Or set_index + unstack:
df = df.set_index(['A','B'])['C'].unstack()
print (df)
B x y z
A
a 10 16 20
b 11 17 21
c 15 19 23
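As a quick check that the two approaches agree on this data, the sketch below builds both results from the same original frame and compares them. The names by_pivot and by_unstack are made up for the illustration, and the A/B/C column names are the same assumption as above.

```python
import pandas as pd

# Same sample data as in the question, with assumed column names A/B/C
df = pd.DataFrame({
    "A": ["a", "b", "c"] * 3,
    "B": ["x"] * 3 + ["y"] * 3 + ["z"] * 3,
    "C": [10, 11, 15, 16, 17, 19, 20, 21, 23],
})

# Approach 1: pivot
by_pivot = df.pivot(index="A", columns="B", values="C")

# Approach 2: move A and B into a MultiIndex, then unstack the inner level (B)
by_unstack = df.set_index(["A", "B"])["C"].unstack()

# Compare values, dtypes and labels of the two results
same = by_pivot.equals(by_unstack)
print(same)
```

Note that both variants start from the original long frame, not from an already pivoted one.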
If the (A, B) pairs contain duplicates, use pivot_table with an aggregate function such as mean or sum:
print (df)
A B C
0 a x 10 <-same a,x different C = 10
1 a x 13 <-same a,x different C = 13
2 b x 11
3 c x 15
4 a y 16
5 b y 17
6 c y 19
7 a z 20
8 b z 21
9 c z 23
df = df.pivot_table(index='A', columns='B', values='C', aggfunc='mean')
Or groupby + aggregate function + unstack:
df = df.groupby(['A','B'])['C'].mean().unstack()
print (df)
B x y z
A
a 11.5 16.0 20.0 <- (10 + 13) / 2 = 11.5
b 11.0 17.0 21.0
c 15.0 19.0 23.0
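To see why the aggregate function is needed, the sketch below first tries plain pivot on the data with the duplicated (a, x) pair, which is expected to raise a ValueError, and then falls back to pivot_table with mean and with sum. The names pivot_failed, mean_wide and sum_wide are only illustrative.

```python
import pandas as pd

# Sample data with a duplicated (a, x) pair; column names A/B/C are assumed
df = pd.DataFrame({
    "A": ["a", "a", "b", "c", "a", "b", "c", "a", "b", "c"],
    "B": ["x", "x", "x", "x", "y", "y", "y", "z", "z", "z"],
    "C": [10, 13, 11, 15, 16, 17, 19, 20, 21, 23],
})

# pivot cannot decide which of the two (a, x) values to keep
pivot_failed = False
try:
    df.pivot(index="A", columns="B", values="C")
except ValueError:
    pivot_failed = True

# pivot_table resolves duplicates by aggregating them
mean_wide = df.pivot_table(index="A", columns="B", values="C", aggfunc="mean")
sum_wide = df.pivot_table(index="A", columns="B", values="C", aggfunc="sum")

print(pivot_failed)
print(mean_wide)
print(sum_wide)
```

Which aggregate is appropriate depends on what the duplicates mean in your data, so mean and sum here are just two examples.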