
Sort all columns of a pandas DataFrame independently using sort_values()

Tags: python, pandas

I have a dataframe and want to sort all columns independently in descending or ascending order.

import pandas as pd

data = {'a': [5, 2, 3, 6],
        'b': [7, 9, 1, 4],
        'c': [1, 5, 4, 2]}
df = pd.DataFrame.from_dict(data)
print(df)
   a  b  c
0  5  7  1
1  2  9  5
2  3  1  4
3  6  4  2

When I use sort_values() for this, it does not behave as I expected and appears to sort by only one column:

foo = df.sort_values(by=['a', 'b', 'c'], ascending=[False, False, False])
print(foo)
   a  b  c
3  6  4  2
0  5  7  1
2  3  1  4
1  2  9  5

I can get the desired result if I use the solution from this answer which applies a lambda function:

bar = df.apply(lambda x: x.sort_values().values)
print(bar)

   a  b  c
0  2  1  1
1  3  4  2
2  5  7  4
3  6  9  5

But this looks a bit heavy-handed to me.

What is actually happening in the sort_values() example above, and how can I sort all columns of my dataframe in an idiomatic pandas way, without the lambda function?

Cord Kaldemeyer, asked Dec 10 '22 12:12


1 Answer

You can use numpy.sort with the DataFrame constructor:

import numpy as np

# axis=0 sorts each column independently; np.sort returns a new array
df1 = pd.DataFrame(np.sort(df.values, axis=0), index=df.index, columns=df.columns)
print(df1)
   a  b  c
0  2  1  1
1  3  4  2
2  5  7  4
3  6  9  5
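
On the first part of the question: with several columns in by, sort_values() reorders whole rows. It sorts by the first key and consults the later keys only to break ties, so when every value in 'a' is unique, 'b' and 'c' never influence the result. A small sketch with a deliberately tied value (the frame ties is made up for illustration) shows this:

```python
import pandas as pd

# Hypothetical frame: column 'a' contains a tie (5 appears twice).
ties = pd.DataFrame({'a': [5, 2, 5, 6],
                     'b': [7, 9, 8, 4]})

# Rows are ordered by 'a' first; 'b' is only consulted where 'a' ties.
out = ties.sort_values(by=['a', 'b'], ascending=[False, False])
print(out)
#    a  b
# 3  6  4
# 2  5  8
# 0  5  7
# 1  2  9
```

Each row stays intact, so column 'b' ends up as 4, 8, 7, 9 rather than sorted on its own. That is why a per-column sort has to go through apply or NumPy.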

EDIT:

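Answer with descending order:

# np.sort returns a sorted copy, so df itself is left unchanged
arr = np.sort(df.values, axis=0)
# reverse the row order to get descending columns
arr = arr[::-1]
print(arr)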
[[6 9 5]
 [5 7 4]
 [3 4 2]
 [2 1 1]]

df1 = pd.DataFrame(arr, index=df.index, columns=df.columns)
print(df1)
   a  b  c
0  6  9  5
1  5  7  4
2  3  4  2
3  2  1  1
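
If both directions are needed, the two variants can be folded into one small helper. This is only a sketch: the name sort_columns and its ascending parameter are hypothetical and not part of pandas, and it assumes all columns hold values that NumPy can compare with each other (for example, all numeric).

```python
import numpy as np
import pandas as pd


def sort_columns(df, ascending=True):
    """Sort every column independently (hypothetical helper, not a pandas API)."""
    # np.sort returns a sorted copy, so the input frame is not modified.
    arr = np.sort(df.to_numpy(), axis=0)
    if not ascending:
        # Reverse the row order to turn ascending columns into descending ones.
        arr = arr[::-1]
    # Reuse the original index and column labels, as in the answer above.
    return pd.DataFrame(arr, index=df.index, columns=df.columns)


df = pd.DataFrame({'a': [5, 2, 3, 6],
                   'b': [7, 9, 1, 4],
                   'c': [1, 5, 4, 2]})
asc = sort_columns(df)
desc = sort_columns(df, ascending=False)
print(asc)
print(desc)
```

Note that the original index labels are reused only as positions here. After an independent sort, a row no longer corresponds to any single original row.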
jezrael, answered Apr 05 '23 23:04