I have a dataframe with a 3-level deep multi-index on the columns. I would like to compute subtotals across rows (sum(axis=1)
) where I sum across one of the levels while preserving the others. I think I know how to do this using the level
keyword argument of pd.DataFrame.sum
. However, I'm having trouble thinking of how to incorporate the result of this sum back into the original table.
Setup:
import numpy as np
import pandas as pd
from itertools import product
np.random.seed(0)
colors = ['red', 'green']
shapes = ['square', 'circle']
obsnum = range(5)
rows = list(product(colors, shapes, obsnum))
idx = pd.MultiIndex.from_tuples(rows)
idx.names = ['color', 'shape', 'obsnum']
df = pd.DataFrame({'attr1': np.random.randn(len(rows)),
'attr2': 100 * np.random.randn(len(rows))},
index=idx)
df.columns.names = ['attribute']
df = df.unstack(['color', 'shape'])
Gives a nice frame like so:
Say I wanted to reduce the shape
level. I could run:
tots = df.sum(axis=1, level=['attribute', 'color'])
to get my totals like so:
Once I have this, I'd like to tack it on to the original frame. I think I can do this in a somewhat cumbersome way:
tots = df.sum(axis=1, level=['attribute', 'color'])
newcols = pd.MultiIndex.from_tuples(list((i[0], i[1], 'sum(shape)') for i in tots.columns))
tots.columns = newcols
bigframe = pd.concat([df, tots], axis=1).sort_index(axis=1)
Is there a more natural way to do this?
Here is a way without loops:
s = df.sum(axis=1, level=[0,1]).T
s["shape"] = "sum(shape)"
s.set_index("shape", append=True, inplace=True)
df.combine_first(s.T)
The trick is to use the transposed sum. So we can insert another column (i.e. row) with the name of the additional level, which we name exactly like the one we summed over. This column can be converted to a level in the index with set_index
. Then we combine df
with the transposed sum. If the summed level is not the last one you might need some level reordering.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With