I have a multi-index series that looks like
             value
foo bar baz
1   A   C        6
        D        2
    B   D        6
        F        4
2   B   C        5
        F        7
I would like to sum on foo and bar, to get the sum of values for each foo, bar, regardless of baz, which I can achieve with df.groupby(level=[0, 1]).sum(). This series looks like:
         sum_value
foo bar
1   A            8
    B           10
2   B           12
However, I would then like to divide the original value by the new sum_value, to get the percentage of baz, given foo and bar.
               value
foo bar baz
1   A   C    6/8=.75
        D    2/8=.25
    B   D    6/10=.6
        F    4/10=.4
2   B   C   5/12=.42
        F   7/12=.58
I have tried df.div(df.groupby(level=[0, 1]).sum()), but get a Not Implemented error. Thanks!
You could do it like this: use transform to get the group sums with the same index as the original dataframe, then use div, which relies on pandas' intrinsic data alignment:
df.div(df.groupby(['foo','bar']).transform('sum'))
Output:
                value
foo bar baz
1   A   C    0.750000
        D    0.250000
    B   D    0.600000
        F    0.400000
2   B   C    0.416667
        F    0.583333
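A self-contained sketch of the approach above, for anyone who wants to run it end to end. The frame is built by hand to match the data in the question (the construction code is an assumption, it is not part of the original post), and the names `index`, `group_sums` and `shares` are illustrative:

```python
import pandas as pd

# Rebuild the example frame from the question (assumed construction).
index = pd.MultiIndex.from_tuples(
    [
        (1, "A", "C"), (1, "A", "D"),
        (1, "B", "D"), (1, "B", "F"),
        (2, "B", "C"), (2, "B", "F"),
    ],
    names=["foo", "bar", "baz"],
)
df = pd.DataFrame({"value": [6, 2, 6, 4, 5, 7]}, index=index)

# transform("sum") returns the group total repeated on every original row,
# so the result keeps the full (foo, bar, baz) index.
group_sums = df.groupby(level=["foo", "bar"]).transform("sum")

# Both frames now share the same index, so div aligns row by row.
shares = df.div(group_sums)
print(shares)
```

Because `group_sums` has exactly the same index as `df`, no cross-level alignment is needed, which is what the plain `df.div(df.groupby(level=[0, 1]).sum())` attempt ran into.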
In [40]: df['value'] = df.reset_index('baz', drop=True).div(df.groupby(level=[0, 1]).sum()).values
In [41]: df
Out[41]:
                value
foo bar baz
1.0 A   C    0.750000
        D    0.250000
    B   D    0.600000
        F    0.400000
2.0 B   C    0.416667
        F    0.583333