I have a pandas DataFrame that I group by specific columns and then count.
import pandas as pd
df = pd.DataFrame({'x':list('aaabbbbbccccc'),'y':list('2225555577777'), 'z':list('1312223224432')})
# count the rows in each (x, y, z) group
df.groupby(['x','y','z'])['z'].count()
# or
df.groupby(['x','y','z'])['z'].agg(['count'])
# or
df.groupby(['x','y','z'])['z'].count().reset_index(name='counts')
The result is:
   x  y  z  counts
0  a  2  1       2
1  a  2  3       1
2  b  5  2       4
3  b  5  3       1
4  c  7  2       2
5  c  7  3       1
6  c  7  4       2
How can I convert the result to the following form?
   x  y  1  2  3  4
0  a  2  2  0  1  0
1  b  5  0  4  1  0
2  c  7  0  2  1  2
You will need to use unstack + reset_index:
(df.groupby(['x','y','z'])['z']
   .count()
   .unstack(-1, fill_value=0)
   .reset_index()
   .rename_axis(None, axis=1)
)
   x  y  1  2  3  4
0  a  2  2  0  1  0
1  b  5  0  4  1  0
2  c  7  0  2  1  2
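To see what each call in the chain contributes, here is the same pipeline split into steps. This is only a sketch on the question's sample data; the intermediate names counts, wide and result are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'x': list('aaabbbbbccccc'),
                   'y': list('2225555577777'),
                   'z': list('1312223224432')})

# Step 1: a Series of counts indexed by the (x, y, z) MultiIndex.
counts = df.groupby(['x', 'y', 'z'])['z'].count()

# Step 2: move the innermost index level (z) into the columns;
# (x, y, z) combinations that never occur are filled with 0.
wide = counts.unstack(-1, fill_value=0)

# Step 3: turn x and y back into ordinary columns and drop the
# leftover 'z' name that unstack leaves on the column axis.
result = wide.reset_index().rename_axis(None, axis=1)

print(result)
```

Since z holds strings in the sample data, the new column labels are the strings '1' to '4', not integers.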
Note that you can replace df.groupby(['x','y','z'])['z'].count() with df.groupby(['x','y','z']).size() for compactness, but beware that size counts every row in a group, including rows where the value is NaN, while count skips NaNs.
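The difference between the two is easiest to see on a small frame with a missing value. The frame below (toy, with columns g and v) is hypothetical and not part of the question's data; it is just a minimal sketch of the NaN behaviour.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: 'v' is a value column (not a grouping key)
# that contains one NaN in group 'a'.
toy = pd.DataFrame({'g': ['a', 'a', 'b'],
                    'v': [1.0, np.nan, 2.0]})

counts = toy.groupby('g')['v'].count()  # non-NaN values of v per group
sizes = toy.groupby('g').size()         # all rows per group, NaN or not

print(counts)
print(sizes)
```

For group 'a', count reports 1 and size reports 2, so the two are interchangeable only when the counted column has no missing values.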