python pandas simple pivot table sum count

I'm trying to identify the best way to make a simple pivot on my data:

import pandas    
dfn = pandas.DataFrame({
    "A" : [ 'aaa', 'bbb', 'aaa', 'bbb' ],
    "B" : [     1,    10,     2,   30  ],
    "C" : [     2,     0,     3,   20  ] })

The output I would like is a DataFrame grouped by A that sums the values of B and C and counts the rows in each group. The column names have to be exactly Sum_B, Sum_C, and Count, as follows:

A   Sum_B  Sum_C  Count
aaa    3      5       2
bbb   40     20       2

What is the fastest way to do this?

DPColombotto asked Jun 22 '16 10:06


1 Answer

You can use the .agg() function:

In [227]: dfn.groupby('A').agg({'B':sum, 'C':sum, 'A':'count'}).rename(columns={'A':'count'})
Out[227]:
      B  count   C
A
aaa   3      2   5
bbb  40      2  20

or with reset_index():

In [239]: dfn.groupby('A').agg({'B':sum, 'C':sum, 'A':'count'}).rename(columns={'A':'count'}).reset_index()
Out[239]:
     A   B  count   C
0  aaa   3      2   5
1  bbb  40      2  20
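
The snippets above leave the columns named B, C, and count rather than Sum_B, Sum_C, and Count. One way to get the exact names is named aggregation. This is a minimal sketch, assuming pandas 0.25 or later; the variable name result is only for illustration, and Count here is the number of non-null values of B in each group:

```python
import pandas as pd

dfn = pd.DataFrame({
    "A": ["aaa", "bbb", "aaa", "bbb"],
    "B": [1, 10, 2, 30],
    "C": [2, 0, 3, 20],
})

# Named aggregation: each keyword is an output column name,
# each value is a (source column, aggregation) pair.
result = (
    dfn.groupby("A")
       .agg(Sum_B=("B", "sum"), Sum_C=("C", "sum"), Count=("B", "count"))
       .reset_index()  # turn the group key A back into a regular column
)
print(result)
```

If rows with a missing B should still be counted, "size" could be used in place of "count".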

PS Here is a link to examples provided by @evan54

MaxU - stop WAR against UA answered Sep 28 '22 07:09