
How to divide two groupby objects in pandas?

I have the following code:

import numpy as np
import pandas as pd

obs = pd.DataFrame({
    'storm': [1, 1, 1, 1, 0, 0, 0, 0],
    'lightning': [1, 1, 0, 0, 1, 1, 0, 0],
    'thunder': [1, 0, 1, 0, 1, 0, 1, 0],
    'p': [0.20, 0.05, 0.04, 0.36, 0.04, 0.01, 0.03, 0.27]
})
g1 = obs.groupby(['lightning', 'thunder']).agg({'p': 'sum'})
g2 = obs.groupby(['lightning', 'thunder', 'storm']).agg({'p': 'sum'})

which gives

[image: printed output of g1 and g2]

Now, how do I divide the more detailed groupby result (g2) by the less detailed one (g1) to calculate percentages?

I have read "Pandas percentage of total with groupby" but could not work out how to adapt it to my case.

Dims asked Jun 28 '16 19:06




1 Answer

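Use g2.unstack() to move the last index level (storm) into the columns. Then divide by g1.p with div(..., axis=0), which aligns on the remaining (lightning, thunder) index and broadcasts over the columns. Finally, stack() to restore the original shape.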

g2.unstack().div(g1.p, axis=0).stack()
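As a rough sketch, the one-liner above can be split into its three steps so each intermediate result can be inspected. The names `wide` and `share` are illustrative only and do not appear in the original answer; the closing sanity check is an added assumption about what "percentage" should mean here (shares within each lightning/thunder group summing to 1).

```python
import pandas as pd

obs = pd.DataFrame({
    'storm': [1, 1, 1, 1, 0, 0, 0, 0],
    'lightning': [1, 1, 0, 0, 1, 1, 0, 0],
    'thunder': [1, 0, 1, 0, 1, 0, 1, 0],
    'p': [0.20, 0.05, 0.04, 0.36, 0.04, 0.01, 0.03, 0.27]
})
g1 = obs.groupby(['lightning', 'thunder']).agg({'p': 'sum'})
g2 = obs.groupby(['lightning', 'thunder', 'storm']).agg({'p': 'sum'})

# Step 1: move the 'storm' level from the index into the columns, so the
# remaining index (lightning, thunder) matches the index of g1.
wide = g2.unstack()

# Step 2: divide every column by the group totals in g1.p, aligning on the
# row index (axis=0). Step 3: stack 'storm' back into the index.
# ('wide' and 'share' are hypothetical names used for illustration.)
share = wide.div(g1.p, axis=0).stack()

# Sanity check: within each (lightning, thunder) group the shares sum to 1.
totals = share.groupby(level=['lightning', 'thunder'])['p'].sum()
print(share)
print(totals)
```

Splitting the steps this way makes it easier to see that the division only works because `wide` and `g1.p` share the same two-level index after the unstack.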


piRSquared answered Sep 18 '22 11:09
