I would like to compute the total for each sublevel of a MultiIndex and then store those totals in the dataframe.
My current dataframe looks like:
                values
first second
bar   one     0.106521
      two     1.964873
baz   one     1.289683
      two    -0.696361
foo   one    -0.309505
      two     2.890406
qux   one    -0.758369
      two     1.302628
And the needed result is:
                values
first second
bar   one     0.106521
      two     1.964873
      total   2.071394
baz   one     1.289683
      two    -0.696361
      total   0.593322
foo   one    -0.309505
      two     2.890406
      total   2.580901
qux   one    -0.758369
      two     1.302628
      total   0.544259
total one     0.328331
      two     5.461546
      total   5.789877
I have found the following implementation, which works, but I would like to know if there are better options. I need the fastest solution possible, because when my dataframes become huge the computation seems to take ages.
In [1]: from numpy.random import randn
In [2]: from pandas import DataFrame, MultiIndex, Series
In [3]: arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
   ...:           ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
In [4]: tuples = list(zip(*arrays))
In [5]: index = MultiIndex.from_tuples(tuples, names=['first', 'second'])
In [6]: s = Series(randn(8), index=index)
In [7]: d = {'values': s}
In [8]: df = DataFrame(d)
In [9]: for col in df.index.names:
   ...:     df = df.unstack(col)                      # move this index level into the columns
   ...:     df[('values', 'total')] = df.sum(axis=1)  # add a 'total' column summing across it
   ...:     df = df.stack()                           # move the level back into the index
Not sure if you are still looking for an answer to this, but you could try something like this, assuming your current dataframe is assigned to df:
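# pivot() looks up 'first' and 'second' as columns, so move them out of the index first
temp = df.reset_index().pivot(index='first', columns='second', values='values')
temp['total'] = temp['one'] + temp['two']
temp.stack()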