Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python Pandas - Sort Values by keeping a specific order

I am trying to sort values by keeping the index in a specific order.

from random import randint
import pandas as pd
days = ["Tuesday", "Thursday", "Monday", "Wednesday"]
a = pd.DataFrame({"Value": [randint(0, 9) for i in range(len(days)*5)],
                  "Year": [y for i in range(len(days)) for y in range(2014,2019)]}, 
                  index=[day for day in days for i in range(5)])
myorder = ["Monday", "Tuesday", "Wednesday", "Thursday"]
a.index = pd.CategoricalIndex(a.index, categories=myorder, ordered=True)
a = a.sort_index()

By applying a.sort_index() I get my specific order. However, values of Year are random. If we naively a.sort_values(["Year"]), it modifies again the index order. How can I sort the Year values by keeping my initial index order ?

like image 545
J. Doe Avatar asked Feb 15 '18 11:02

J. Doe


People also ask

How do you sort a Dataframe in a specific order?

You can sort by column values in pandas DataFrame using sort_values() method. To specify the order, you have to use ascending boolean property; False for descending and True for ascending. By default, it is set to True.

Which function is pandas is used to arrange values in a particular order?

sort_index() , which will enable you to sort data efficiently in a DataFrame. By the end of this tutorial, you'll know how to: Sort a pandas DataFrame by the values of one or more columns. Use the ascending parameter to change the sort order.

How do I sort values in pandas series?

Sort the Series in Ascending Order By default, the pandas series sort_values() function sorts the series in ascending order. You can also use ascending=True param to explicitly specify to sort in ascending order. Also, if you have any NaN values in the Series, it sort by placing all NaN values at the end.

How do you sort a Dataframe by a specific column in Python?

Parameters: This method will take following parameters : by: Single/List of column names to sort Data Frame by. axis: 0 or 'index' for rows and 1 or 'columns' for Column. ascending: Boolean value which sorts Data frame in ascending order if True.


2 Answers

You need create column from index and sort together:

a = a.reset_index().sort_values(['index','Year']).set_index('index').rename_axis(None)

Or create MultiIndex from column and sort together:

a = (a.set_index('Year', append=True)
      .sort_index()
      .reset_index(level=1)
      .reindex(columns=a.columns))

print (a)
           Value  Year
Monday         7  2014
Monday         3  2015
Monday         2  2016
Monday         5  2017
Monday         4  2018
Tuesday        6  2014
Tuesday        0  2015
Tuesday        0  2016
Tuesday        9  2017
Tuesday        2  2018
Wednesday      6  2014
Wednesday      7  2015
Wednesday      5  2016
Wednesday      5  2017
Wednesday      5  2018
Thursday       3  2014
Thursday       2  2015
Thursday       8  2016
Thursday       7  2017
Thursday       7  2018
like image 167
jezrael Avatar answered Oct 04 '22 21:10

jezrael


Non-categorical approach, sorting by customized index order & Year simultaneously:

orderdic = dict(zip(myorder, range(len(myorder))))

a = a.assign(order=a.index.to_series().map(orderdic))\
      .sort_values(['order', 'Year']).drop('order', 1)

#            Value  Year
# Monday         2  2014
# Monday         4  2015
# Monday         8  2016
# Monday         8  2017
# Monday         7  2018
# Tuesday        5  2014
# Tuesday        4  2015
# Tuesday        0  2016
# Tuesday        1  2017
# Tuesday        3  2018
# Wednesday      2  2014
# Wednesday      8  2015
# Wednesday      4  2016
# Wednesday      3  2017
# Wednesday      4  2018
# Thursday       7  2014
# Thursday       4  2015
# Thursday       7  2016
# Thursday       2  2017
# Thursday       1  2018
like image 23
jpp Avatar answered Oct 04 '22 20:10

jpp