Sort pandas dataframe both on values of a column and index?

Is it possible to sort a pandas DataFrame by the values of a column and also by its index?

If you sort a DataFrame by the values of a column, the result is ordered by that column, but rows that share the same column value end up with their index labels in no particular order.

So, can I sort a DataFrame by a column, such as one named count, and also by the index values? And is it possible to sort the column in descending order while sorting the index in ascending order?

I know how to sort by multiple columns, and I know I can achieve this by calling reset_index(), sorting, and then setting the index again. But is there a more intuitive and efficient way to do it?

Blaszard asked Nov 29 '13 02:11



2 Answers

Pandas 0.23 finally gets you there :-D

You can now pass index names (and not only column names) as parameters to sort_values. So, this one-liner works:

df = df.sort_values(by = ['MyCol', 'MyIdx'], ascending = [False, True])

And if your index is currently unnamed:

df = df.rename_axis('MyIdx').sort_values(by = ['MyCol', 'MyIdx'], ascending = [False, True])
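
To make that concrete, here is a minimal sketch on made-up data. The values and index labels are invented for illustration; only the names MyCol and MyIdx come from the snippets above. It assumes pandas 0.23 or later.

```python
import pandas as pd

# Made-up sample data; 'MyCol' and 'MyIdx' just mirror the names used above.
df = pd.DataFrame(
    {"MyCol": [2, 1, 2, 1, 3]},
    index=pd.Index([30, 10, 20, 40, 50], name="MyIdx"),
)

# Column descending, index ascending within ties (needs pandas 0.23+).
result = df.sort_values(by=["MyCol", "MyIdx"], ascending=[False, True])

# Same data with an unnamed index: name it first, then sort the same way.
unnamed = pd.DataFrame({"MyCol": [2, 1, 2, 1, 3]}, index=[30, 10, 20, 40, 50])
result_unnamed = unnamed.rename_axis("MyIdx").sort_values(
    by=["MyCol", "MyIdx"], ascending=[False, True]
)

print(result)
```

Both results should list the rows in index order 50, 20, 30, 10, 40: the MyCol values go 3, 2, 2, 1, 1, and within each tie the index ascends.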
OmerB answered Oct 17 '22 18:10


In pandas 0.23+ you can do it directly - see OmerB's answer. If you don't yet have 0.23+, read on.


I'd venture that the simplest way is to just copy your index over to a column, and then sort by both.

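df['colFromIndex'] = df.index
df = df.sort_values(['count', 'colFromIndex'])  # on pandas older than 0.17, use df.sort(...) instead

I'd also prefer to be able to just do something like df.sort_values(['count', 'index']), but of course that doesn't work with an unnamed index before 0.23.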
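
Here is a small runnable sketch of that workaround, extended to the descending-column, ascending-index case from the question. The data is made up, and dropping the helper column afterwards is my own addition rather than part of the original snippet.

```python
import pandas as pd

# Made-up sample data with an unnamed index.
df = pd.DataFrame({"count": [2, 1, 2, 1, 3]}, index=[30, 10, 20, 40, 50])

# Copy the index into a temporary helper column (the name is arbitrary).
df["colFromIndex"] = df.index

# Sort 'count' descending and the copied index ascending, then drop the helper.
result = df.sort_values(
    ["count", "colFromIndex"], ascending=[False, True]
).drop("colFromIndex", axis=1)

print(result)
```

The original index labels travel with their rows, so nothing needs to be rebuilt afterwards; only the helper column is removed.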

fantabolous answered Oct 17 '22 20:10