Sort pandas DataFrame with MultiIndex according to column value

Tags: python, pandas, dataframe, multi-index

I have a DataFrame with a MultiIndex that looks like this when printed in the console:

                             value  indA  indB
           scenarioId group                        
2015-04-13    1       A           -54.0   1.0   1.0
                      B          -160.0   1.0   1.0
                      C           -15.0   0.0   1.0
              2       A           -83.0   1.0   1.0
              3       A           -80.0   2.0   2.0
              4       A          -270.0   2.0   2.0
2015-04-14    1       A           -56.0   1.0   1.0
                      B            -1.0   1.0   1.0
                      C           -60.0   0.0   1.0
              2       A           -32.0   1.0   1.0
              3       A           -91.0   2.0   2.0
              4       A           -17.0   2.0   2.0

I obtained it by applying groupby and sum to my initial dataset.

I would like to keep the same format, but order the rows by the value column within each date and scenarioId. I have tried the sorting functions without success, and I suspect the problem is that the first level of the MultiIndex (the dates) has no name.

Essentially, the output should look like this:

                             value  indA  indB
           scenarioId group                        
2015-04-13   1        B          -160.0   1.0   1.0
                      A           -54.0   1.0   1.0
                      C           -15.0   0.0   1.0
             2        A           -83.0   1.0   1.0
             3        A           -80.0   2.0   2.0
             4        A          -270.0   2.0   2.0
2015-04-14   1        C           -60.0   0.0   1.0
                      A           -56.0   1.0   1.0
                      B            -1.0   1.0   1.0
             2        A           -32.0   1.0   1.0
             3        A           -91.0   2.0   2.0
             4        A           -17.0   2.0   2.0

Could someone enlighten me on this please?

Thanks in advance.


asked Apr 18 '17 12:04

JejeBelfort

1 Answer

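You can use sort_values followed by sort_index. Sort by value first, then re-sort on the first two index levels only; sort_remaining=False leaves the group level unsorted, so the order produced by the first sort is kept within each date and scenarioId: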

print (df.sort_values('value').sort_index(level=[0,1], sort_remaining=False))
                             value  indA  indB
           scenarioId group                   
2015-04-13 1          B     -160.0   1.0   1.0
                      A      -54.0   1.0   1.0
                      C      -15.0   0.0   1.0
           2          A      -83.0   1.0   1.0
           3          A      -80.0   2.0   2.0
           4          A     -270.0   2.0   2.0
2015-04-14 1          C      -60.0   0.0   1.0
                      A      -56.0   1.0   1.0
                      B       -1.0   1.0   1.0
           2          A      -32.0   1.0   1.0
           3          A      -91.0   2.0   2.0
           4          A      -17.0   2.0   2.0
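To try this without the original dataset, here is a minimal self-contained sketch. The way the frame is rebuilt (MultiIndex.from_tuples with an unnamed first level) is an assumption on my part, not how the asker produced it; the numbers are copied from the question. Note that the approach relies on sort_index keeping tied rows in their incoming order, which is what the output above shows.

```python
import pandas as pd

# Rebuild the frame from the question (construction is an assumption;
# the asker got it from groupby + sum). The first index level has no name.
dates = pd.to_datetime(["2015-04-13", "2015-04-14"])
pairs = [(1, "A"), (1, "B"), (1, "C"), (2, "A"), (3, "A"), (4, "A")]
index = pd.MultiIndex.from_tuples(
    [(d, s, g) for d in dates for s, g in pairs],
    names=[None, "scenarioId", "group"],
)
df = pd.DataFrame(
    {
        "value": [-54.0, -160.0, -15.0, -83.0, -80.0, -270.0,
                  -56.0, -1.0, -60.0, -32.0, -91.0, -17.0],
        "indA": [1.0, 1.0, 0.0, 1.0, 2.0, 2.0, 1.0, 1.0, 0.0, 1.0, 2.0, 2.0],
        "indB": [1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0],
    },
    index=index,
)

# Step 1: order every row by value.
# Step 2: re-sort on index levels 0 and 1 only; the group level is left alone.
result = df.sort_values("value").sort_index(level=[0, 1], sort_remaining=False)
print(result)
```

Levels are addressed by position here (level=[0, 1]), so the missing name on the date level does not get in the way.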

Another solution is sort_values combined with reset_index and set_index. After reset_index, the unnamed first index level becomes a column called level_0:

# the parentheses are needed so the chained calls can span several lines
df = (df.reset_index()
        .sort_values(['level_0','scenarioId','value'])
        .set_index(['level_0','scenarioId','group']))
print (df)
                             value  indA  indB
level_0    scenarioId group                   
2015-04-13 1          B     -160.0   1.0   1.0
                      A      -54.0   1.0   1.0
                      C      -15.0   0.0   1.0
           2          A      -83.0   1.0   1.0
           3          A      -80.0   2.0   2.0
           4          A     -270.0   2.0   2.0
2015-04-14 1          C      -60.0   0.0   1.0
                      A      -56.0   1.0   1.0
                      B       -1.0   1.0   1.0
           2          A      -32.0   1.0   1.0
           3          A      -91.0   2.0   2.0
           4          A      -17.0   2.0   2.0
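If the level_0 label in the result is unwanted, one possible variation is to give the first level a temporary name before resetting the index and to remove it again afterwards. This is only a sketch: the name "date" is a hypothetical choice of mine, and the frame is rebuilt from the numbers in the question rather than from the asker's data.

```python
import pandas as pd

# Rebuild the frame from the question (construction is an assumption).
dates = pd.to_datetime(["2015-04-13", "2015-04-14"])
pairs = [(1, "A"), (1, "B"), (1, "C"), (2, "A"), (3, "A"), (4, "A")]
index = pd.MultiIndex.from_tuples(
    [(d, s, g) for d in dates for s, g in pairs],
    names=[None, "scenarioId", "group"],
)
df = pd.DataFrame(
    {
        "value": [-54.0, -160.0, -15.0, -83.0, -80.0, -270.0,
                  -56.0, -1.0, -60.0, -32.0, -91.0, -17.0],
        "indA": [1.0, 1.0, 0.0, 1.0, 2.0, 2.0, 1.0, 1.0, 0.0, 1.0, 2.0, 2.0],
        "indB": [1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0],
    },
    index=index,
)

# "date" is a hypothetical, temporary name for the unnamed first level.
named = df.rename_axis(index=["date", "scenarioId", "group"])

# Move the index into columns, sort by date, scenarioId, then value,
# and rebuild the same three-level index.
result = (
    named.reset_index()
         .sort_values(["date", "scenarioId", "value"])
         .set_index(["date", "scenarioId", "group"])
)

# Drop the temporary name so the frame prints like the original.
result.index.names = [None, "scenarioId", "group"]
print(result)
```

Because value is an explicit sort key here, this version does not depend on how ties are handled by a second sort.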

answered Oct 09 '22 23:10

jezrael
