I know that this question has been asked several times. But none of the answers match my case.
I've a pandas dataframe with columns,department and employee_count. I need to sort the employee_count column in descending order. But if there is a tie between 2 employee_counts then they should be sorted alphabetically based on department.
Department Employee_Count
0 abc 10
1 adc 10
2 bca 11
3 cde 9
4 xyz 15
required output:
Department Employee_Count
0 xyz 15
1 bca 11
2 abc 10
3 adc 10
4 cde 9
This is what I've tried.
df = df.sort_values(['Department','Employee_Count'],ascending=[True,False])
But this just sorts the departments alphabetically.
I've also tried to sort by Department first and then by Employee_Count. Like this:
df = df.sort_values(['Department'],ascending=[True])
df = df.sort_values(['Employee_Count'],ascending=[False])
This doesn't give me correct output either:
Department Employee_Count
4 xyz 15
2 bca 11
1 adc 10
0 abc 10
3 cde 9
It gives 'adc' first and then 'abc'. Kindly help me.
Sorting Your DataFrame on a Single Column. To sort the DataFrame based on the values in a single column, you'll use . sort_values() . By default, this will return a new DataFrame sorted in ascending order.
Pandas comes with a whole host of sql-like aggregation functions you can apply when grouping on one or more columns. This is Python's closest equivalent to dplyr's group_by + summarise logic.
To group Pandas dataframe, we use groupby(). To sort grouped dataframe in descending order, use sort_values(). The size() method is used to get the dataframe size.
All of the sorting methods available in Pandas fall under the following three categories: Sorting by index labels; Sorting by column values; Sorting by a combination of index labels and column values.
You can swap columns in list and also values in ascending
parameter:
Explanation:
Order of columns names is order of sorting, first sort descending by Employee_Count
and if some duplicates in Employee_Count
then sorting by Department
only duplicates rows ascending.
df1 = df.sort_values(['Employee_Count', 'Department'], ascending=[False, True])
print (df1)
Department Employee_Count
4 xyz 15
2 bca 11
0 abc 10 <-
1 adc 10 <-
3 cde 9
Or for test if use second False
then duplicated rows are sorting descending
:
df2 = df.sort_values(['Employee_Count', 'Department',],ascending=[False, False])
print (df2)
Department Employee_Count
4 xyz 15
2 bca 11
1 adc 10 <-
0 abc 10 <-
3 cde 9
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With