Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas groupby and sum total of group

I have a Pandas DataFrame with customer refund reasons. It contains these example data rows:

    **case_type**       **claim_type**
1   service             service
2   service             service
3   chargeback          service
4   chargeback          local_charges
5   service             supplier_service
6   chargeback          service
7   chargeback          service
8   chargeback          service
9   chargeback          service
10  chargeback          service
11  service             service_not_used
12  service             service_not_used

I would like to compare the customer's reason with some sort of labeled reason. This is no problem, but I would also like to see the total number of records in a specific group (customer reason).

case_claim_type = df[["case_type", "claim_type"]]
case_claim_type.groupby(by=("case_type", "claim_type"))["case_type"].count()

Which gives me this output, for example:

**case_type**     **claim_type**                 
service           service                         2
                  supplier_service                1
                  service_not_used                2
chargeback        service                         6
                  local_charges                   1

I would also like to have have the sum of the output per case_type. Something like:

**case_type**     **claim_type**                 
service           service                         2
                  supplier_service                1
                  service_not_used                2
                  total:                          5
chargeback        service                         6
                  local_charges                   1
                  total:                          7

It doesn't necessarily has to be in this last output format, a column with the (aggregated) totals per case_type is also fine.

like image 645
eppe2000 Avatar asked Feb 20 '18 15:02

eppe2000


People also ask

How do I create a new column from the output of pandas Groupby () sum ()?

To create a new column for the output of groupby. sum(), we will first apply the groupby. sim() operation and then we will store this result in a new column.

How do you count a group in pandas?

Use count() by Column Name Use pandas DataFrame. groupby() to group the rows by column and use count() method to get the count for each group by ignoring None and Nan values. It works with non-floating type data as well.

How do you count pandas total?

Pandas DataFrame count() MethodThe count() method counts the number of not empty values for each row, or column if you specify the axis parameter as axis='columns' , and returns a Series object with the result for each row (or column).


1 Answers

Where:

df = pd.DataFrame({'case_type':['Service']*20+['chargeback']*9,'claim_type':['service']*5+['local_charges']*5+['service_not_used']*5+['supplier_service']*5+['service']*8+['local_charges']})

df_out = df.groupby(by=("case_type", "claim_type"))["case_type"].count()

Let use pd.concat, sum with level parameter, and assign:

(pd.concat([df_out.to_frame(),
           df_out.sum(level=0).to_frame()
                 .assign(claim_type= "total")
                 .set_index('claim_type', append=True)])
  .sort_index())

Output:

                             case_type
case_type  claim_type                 
Service    local_charges             5
           service                   5
           service_not_used          5
           supplier_service          5
           total                    20
chargeback local_charges             1
           service                   8
           total                     9
like image 124
Scott Boston Avatar answered Oct 22 '22 03:10

Scott Boston