Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python Pandas, aggregate multiple columns from one

I'm new to pandas and I have a DataFrame of this kind :

    name    value
0   alpha   a
1   beta    b
2   gamma   c
3   alpha   a
4   beta    b
5   beta    a
6   gamma   a
7   alpha   c

which I would like to turn into one of this kind :

    name    a   b   c
0   alpha   2   0   1
1   beta    1   2   0
2   gamma   1   0   1

That is to say I would like to group by "name" and "value", then count them, and create a column for each value of "value" I find.

like image 292
Drico Avatar asked Apr 01 '26 00:04

Drico


1 Answers

It is just a cross tabulation:

In [78]:

print pd.crosstab(df.name, df.value)
value  a  b  c
name          
alpha  2  0  1
beta   1  2  0
gamma  1  0  1

If you use groupby:

In [90]:

print df.groupby(['name', 'value']).agg(len).unstack().fillna(0)
value  a  b  c
name          
alpha  2  0  1
beta   1  2  0
gamma  1  0  1

The latter might be faster:

In [92]:

%timeit df.groupby(['name', 'value']).agg(len).unstack().fillna(0)
100 loops, best of 3: 3.26 ms per loop
In [93]:

%timeit pd.crosstab(df.name, df.value)
100 loops, best of 3: 7.5 ms per loop
like image 94
CT Zhu Avatar answered Apr 03 '26 16:04

CT Zhu



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!