Is it possible to group a pandas DataFrame with a two-level MultiIndex by just one of the index levels? The only way I know is to call reset_index on the MultiIndex and then set the index again. I am sure there is a better way, and I would like to know what it is.


Group a multi-indexed pandas dataframe by one of its levels?

2 Answers

Yes, use the level parameter of groupby. Example:

In [26]: s

first  second  third
bar    doo     one      0.404705
               two      0.577046
baz    bee     one     -1.715002
               two     -1.039268
foo    bop     one     -0.370647
               two     -1.157892
qux    bop     one     -1.344312
               two      0.844885
dtype: float64

In [27]: s.groupby(level=['first','second']).sum()

first  second
bar    doo       0.981751
baz    bee      -2.754270
foo    bop      -1.528539
qux    bop      -0.499427
dtype: float64
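For the case in the question (grouping by a single level), the same level parameter also accepts one name or one integer position. A minimal sketch, assuming a small made-up DataFrame; the names df, first, second and data are illustrative and not taken from the session above:

```python
import pandas as pd

# Hypothetical two-level frame, for illustration only.
index = pd.MultiIndex.from_tuples(
    [("bar", "one"), ("bar", "two"), ("baz", "one"), ("baz", "two")],
    names=["first", "second"],
)
df = pd.DataFrame({"data": [1, 2, 3, 4]}, index=index)

# Group by one level, referenced by name...
by_name = df.groupby(level="first").sum()

# ...or by its integer position in the MultiIndex (0 is the outermost level).
by_position = df.groupby(level=0).sum()

# Grouping by the inner level works the same way.
by_second = df.groupby(level="second").sum()

print(by_name)
print(by_second)
```

No reset_index/set_index round trip is needed; the result is indexed by the level that was grouped on.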


answered Sep 18 '22 06:09

elyase

In recent versions of pandas (0.20 and later), you can group by MultiIndex level names the same way you group by columns (i.e. without the level keyword), which lets you mix level names and column names in a single groupby call.

>>> import pandas as pd
>>> pd.__version__
'1.0.5'
>>> df = pd.DataFrame({
...     'first': ['a', 'a', 'a', 'b', 'b', 'b'],
...     'second': ['x', 'y', 'x', 'z', 'y', 'z'],
...     'column': ['k', 'k', 'l', 'l', 'm', 'n'],
...     'data': [0, 1, 2, 3, 4, 5],
... }).set_index(['first', 'second'])
>>> df.groupby('first').sum()
       data
first
a         3
b        12
>>> df.groupby(['second', 'column']).sum()
               data
second column
x      k          0
       l          2
y      k          1
       m          4
z      l          3
       n          5

The column and index level names you group by must be unique. If a column and an index level share the same name, groupby raises a ValueError because the name is ambiguous.
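The name clash is easy to reproduce. The sketch below assumes a made-up frame in which 'first' is both an index level and a column (all names and values are illustrative). Passing level= is one way around the ambiguity, since it only ever refers to index levels:

```python
import pandas as pd

# Hypothetical frame where 'first' names both an index level and a column.
df = pd.DataFrame(
    {"first": [10, 20, 30, 40], "data": [1, 2, 3, 4]},
    index=pd.MultiIndex.from_tuples(
        [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")],
        names=["first", "second"],
    ),
)

# Grouping by the bare name is ambiguous, so pandas refuses.
try:
    df.groupby("first").sum()
    raised = False
except ValueError as exc:
    raised = True
    print(f"ValueError: {exc}")

# level= only looks at index levels, so this call is unambiguous.
totals = df.groupby(level="first")["data"].sum()
print(totals)
```

Renaming either the column or the index level would also resolve the conflict.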

answered Sep 22 '22 06:09

HoosierDaddy
