Is it possible to group a pandas DataFrame with a two-level MultiIndex by one of the index levels?
The only way I know of doing it is to call reset_index on the MultiIndex and then set the index again. I am sure there is a better way to do it, and I want to know how.
How do you group by the index in pandas? Pass the index name of the DataFrame as a parameter to the groupby() function to group rows on an index. DataFrame.groupby() takes a string or a list as a parameter to specify the group columns or index levels.
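As a minimal sketch of grouping on a named index, assuming a made-up DataFrame whose index is called "city" (the data and names are illustrative only):

```python
import pandas as pd

# Hypothetical data: a single-level index named "city".
df = pd.DataFrame(
    {"sales": [10, 20, 30, 40]},
    index=pd.Index(["Oslo", "Oslo", "Rome", "Rome"], name="city"),
)

# Pass the index name as the grouping key, just like a column name.
by_name = df.groupby("city")["sales"].sum()

# The explicit form: name the index level through the level parameter.
by_level = df.groupby(level="city")["sales"].sum()
```

Both calls should produce the same per-city totals; the level form is the more explicit of the two.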
What if you could have more than one column in your DataFrame's index? The multi-level index (MultiIndex) feature in pandas allows you to do just that. A regular pandas DataFrame has a single index that acts as a row label. By default, these index values are integers counting up from 0.
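To make the idea concrete, here is a small sketch that builds a two-level index with set_index. The column names "region", "year" and "revenue" are invented for illustration:

```python
import pandas as pd

# Hypothetical data; set_index with a list of columns creates a MultiIndex.
df = pd.DataFrame({
    "region": ["north", "north", "south", "south"],
    "year": [2020, 2021, 2020, 2021],
    "revenue": [1.0, 2.0, 3.0, 4.0],
}).set_index(["region", "year"])

n_levels = df.index.nlevels          # number of index levels
level_names = list(df.index.names)   # names of the index levels

# A row is now addressed by a tuple, one value per index level.
south_2021 = df.loc[("south", 2021), "revenue"]
```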
The pandas groupby() method is versatile. It splits the data into groups based on some criteria (column values or index levels), after which you can apply aggregations such as mean, median, value_counts, etc. To turn the group keys back into ordinary columns after groupby(), use the reset_index() function.
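A short sketch of that pattern, assuming a small made-up DataFrame indexed by "first" and "second":

```python
import pandas as pd

# Hypothetical data with a two-level index.
df = pd.DataFrame({
    "first": ["a", "a", "b", "b"],
    "second": ["x", "y", "x", "y"],
    "data": [1, 2, 3, 4],
}).set_index(["first", "second"])

# After aggregation, the group key "first" is the index of the result.
totals = df.groupby(level="first").sum()

# reset_index() moves the group key back into a regular column.
flat = totals.reset_index()
```

Here flat has a default integer index and the columns "first" and "data".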
Yes, use the level parameter of groupby(). Example:
    In [26]: s
    first  second  third
    bar    doo     one      0.404705
                   two      0.577046
    baz    bee     one     -1.715002
                   two     -1.039268
    foo    bop     one     -0.370647
                   two     -1.157892
    qux    bop     one     -1.344312
                   two      0.844885
    dtype: float64

    In [27]: s.groupby(level=['first','second']).sum()
    first  second
    bar    doo       0.981751
    baz    bee      -2.754270
    foo    bop      -1.528539
    qux    bop      -0.499427
    dtype: float64
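For the case asked about in the question (a two-level index grouped by just one level), a minimal sketch might look like the following. The Series values and the name "value" are invented for illustration, and the reset_index version is included only to compare against the workaround described in the question:

```python
import pandas as pd

# Hypothetical two-level Series.
index = pd.MultiIndex.from_tuples(
    [("bar", "one"), ("bar", "two"), ("baz", "one"), ("baz", "two")],
    names=["first", "second"],
)
s = pd.Series([1, 2, 3, 4], index=index, name="value")

# Group by a single level, referenced by name...
direct = s.groupby(level="first").sum()

# ...or by its integer position (0 is the outermost level).
by_position = s.groupby(level=0).sum()

# The reset_index workaround gives the same totals with more steps.
workaround = s.reset_index().groupby("first")["value"].sum()
```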
In recent versions of pandas, you can group by multi-index level names in the same way as columns (i.e. without the level keyword), allowing you to use both simultaneously.
    >>> import pandas as pd
    >>> pd.__version__
    '1.0.5'
    >>> df = pd.DataFrame({
    ...     'first': ['a', 'a', 'a', 'b', 'b', 'b'],
    ...     'second': ['x', 'y', 'x', 'z', 'y', 'z'],
    ...     'column': ['k', 'k', 'l', 'l', 'm', 'n'],
    ...     'data': [0, 1, 2, 3, 4, 5],
    ... }).set_index(['first', 'second'])
    >>> df.groupby('first').sum()
           data
    first
    a         3
    b        12
    >>> df.groupby(['second', 'column']).sum()
                   data
    second column
    x      k          0
           l          2
    y      k          1
           m          4
    z      l          3
           n          5
The column and index level names you group by must be unique. If you have a column and an index level with the same name, you will get a ValueError when trying to groupby.
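The name clash can be reproduced with a small sketch. The DataFrame below is hypothetical and deliberately uses "key" for both the index and a column; passing level= explicitly is shown as one way to avoid the ambiguity, assuming the index level is the one you mean:

```python
import pandas as pd

# Hypothetical data: "key" names both the index and a column.
df = pd.DataFrame(
    {"key": ["p", "q", "p"], "data": [1, 2, 3]},
    index=pd.Index(["a", "a", "b"], name="key"),
)

# Grouping by the bare name is ambiguous, so pandas raises ValueError.
try:
    df.groupby("key").sum()
    ambiguous = False
except ValueError:
    ambiguous = True

# Naming the index level through level= removes the ambiguity.
by_level = df.groupby(level="key")["data"].sum()
```

Renaming either the column or the index level is another way to resolve the clash.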