How to sort pandas data frame using values from several columns?

I have the following data frame:

df = pandas.DataFrame([{'c1':3,'c2':10},{'c1':2, 'c2':30},{'c1':1,'c2':20},{'c1':2,'c2':15},{'c1':2,'c2':100}]) 

Or, in human readable form:

   c1   c2
0   3   10
1   2   30
2   1   20
3   2   15
4   2  100

The following sort command works as expected:

df.sort(['c1','c2'], ascending=False) 

Output:

   c1   c2
0   3   10
4   2  100
1   2   30
3   2   15
2   1   20

But the following command:

df.sort(['c1','c2'], ascending=[False,True]) 

results in

   c1   c2
2   1   20
3   2   15
1   2   30
4   2  100
0   3   10

and this is not what I expect. I expect the values in the first column to be ordered from largest to smallest and, where the first column has identical values, the rows to be ordered by ascending values in the second column.
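To make the expected ordering concrete, here is a minimal plain-Python sketch that does not use pandas at all. The names rows and expected are illustrative only; each tuple is a (c1, c2) pair taken from the data frame above.

```python
# (c1, c2) pairs copied from the data frame in the question.
rows = [(3, 10), (2, 30), (1, 20), (2, 15), (2, 100)]

# Sort by c1 descending (negated key), then by c2 ascending to break ties.
expected = sorted(rows, key=lambda r: (-r[0], r[1]))

for c1, c2 in expected:
    print(c1, c2)
```

This is the row order I would like the pandas call to produce.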

Does anybody know why it does not work as expected?

ADDED

This is copy-paste:

>>> df.sort(['c1','c2'], ascending=[False,True])
   c1   c2
2   1   20
3   2   15
1   2   30
4   2  100
0   3   10
Roman asked Jul 12 '13 15:07

2 Answers

DataFrame.sort is deprecated and is no longer available in recent pandas versions, as the AttributeError below shows; use DataFrame.sort_values instead.

>>> df.sort_values(['c1','c2'], ascending=[False,True])
   c1   c2
0   3   10
3   2   15
1   2   30
4   2  100
2   1   20
>>> df.sort(['c1','c2'], ascending=[False,True])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ampawake/anaconda/envs/pseudo/lib/python2.7/site-packages/pandas/core/generic.py", line 3614, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'sort'
falsetru answered Oct 02 '22 07:10

Using sort can produce a warning message (see the GitHub discussion), so you may want to use sort_values instead (see the docs).

Then your code can look like this:

df = df.sort_values(by=['c1','c2'], ascending=[False,True]) 
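For completeness, here is a self-contained sketch of the same call applied to the data frame from the question. It assumes a pandas version that provides sort_values; the variable name result is illustrative only.

```python
import pandas as pd

# Data frame from the question.
df = pd.DataFrame([
    {'c1': 3, 'c2': 10},
    {'c1': 2, 'c2': 30},
    {'c1': 1, 'c2': 20},
    {'c1': 2, 'c2': 15},
    {'c1': 2, 'c2': 100},
])

# One ascending flag per sort column: c1 descending, c2 ascending within ties.
result = df.sort_values(by=['c1', 'c2'], ascending=[False, True])
print(result)
```

Note that sort_values returns a new DataFrame and keeps the original index labels, which is why the line above reassigns the result to df.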
HonzaB answered Oct 02 '22 06:10