Multi Index Sorting in Pandas

I have a dataset with multi-index columns in a pandas DataFrame that I would like to sort by the values in a specific column. I have tried using sort_index and sortlevel but haven't been able to get the results I am looking for. My dataset looks like:

      Group1    Group2
      A B C     A B C
    1 1 0 3     2 5 7
    2 5 6 9     1 0 0
    3 7 0 2     0 3 5

I want to sort all the data, together with the index, by column C of Group1 in descending order, so that my results look like:

      Group1    Group2
      A B C     A B C
    2 5 6 9     1 0 0
    1 1 0 3     2 5 7
    3 7 0 2     0 3 5

Is it possible to do this sort with the structure that my data is in, or should I be swapping Group1 to the index side?

MattB asked Feb 06 '13 16:02




1 Answer

When sorting by a column of a MultiIndex, you need to wrap the tuple describing the column in a list*:

    In [11]: df.sort_values([('Group1', 'C')], ascending=False)
    Out[11]:
      Group1       Group2
           A  B  C      A  B  C
    2      5  6  9      1  0  0
    1      1  0  3      2  5  7
    3      7  0  2      0  3  5

* so as not to confuse pandas into thinking you want to sort first by Group1 then by C.
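
If you want to try this without the original data at hand, here is a minimal self-contained sketch. The way the frame is built (MultiIndex.from_product and the name sorted_df) is my own assumption for illustration; only the values and the sort_values call come from the question and the answer above.

```python
import pandas as pd

# Rebuild the example frame from the question (values copied from the post).
# Using MultiIndex.from_product is an assumption; the question does not say
# how the original frame was constructed.
columns = pd.MultiIndex.from_product([["Group1", "Group2"], ["A", "B", "C"]])
df = pd.DataFrame(
    [
        [1, 0, 3, 2, 5, 7],
        [5, 6, 9, 1, 0, 0],
        [7, 0, 2, 0, 3, 5],
    ],
    index=[1, 2, 3],
    columns=columns,
)

# The sort key is the full column tuple, wrapped in a list.
sorted_df = df.sort_values([("Group1", "C")], ascending=False)
print(sorted_df)
```

The rows move together with their index labels, so the index of sorted_df comes out as 2, 1, 3, matching the output requested in the question.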


Note: This answer originally used .sort, which was deprecated and then removed in 0.20 in favor of .sort_values.

Andy Hayden answered Sep 22 '22 10:09
