To sort a DataFrame by the values in a single column, use .sort_values(). By default, it returns a new DataFrame sorted in ascending order and does not modify the original DataFrame.
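As a minimal sketch of the behavior described above (the column names and values here are made up for illustration and are not from the original question):

```python
import pandas as pd

# Hypothetical example data; "name" and "score" are illustrative column names.
df = pd.DataFrame({"name": ["c", "a", "b"], "score": [3, 1, 2]})

# Ascending by default; returns a new DataFrame.
sorted_df = df.sort_values("score")

print(sorted_df)
print(df)  # the original DataFrame keeps its row order
```

Passing ascending=False reverses the order if a descending sort is wanted.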
Another way to reorder columns is to use the pandas .reindex() method. Its columns= parameter takes the column labels in the order you want them.
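A short sketch of reordering with .reindex(), assuming a small made-up DataFrame that reuses column names from the question below:

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"ddd": [1, 2], "fff": [3, 4], "aaa": [5, 6]})

# Request the columns in an explicit order; a new DataFrame is returned.
reordered = df.reindex(columns=["aaa", "ddd", "fff"])

print(reordered)
```

Note that a label passed to columns= that does not exist in the DataFrame produces a new column filled with NaN rather than raising an error, so a typo can go unnoticed.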
[updated to simplify]
tl;dr:
In [29]: new_columns = df.columns[df.loc[df.last_valid_index()].argsort().to_numpy()]
In [30]: df[new_columns]
Out[30]:
aaa ppp fff ddd
0 0.328281 0.375458 1.188905 0.503059
1 0.305457 0.186163 0.077681 -0.543215
2 0.684265 0.681724 0.210636 -0.532685
3 -1.134292 1.832272 0.067946 0.250131
4 -0.834393 0.010211 0.649963 -0.551448
5 -1.032405 -0.749949 0.442398 1.274599
Some explanation follows. First, build the DataFrame:
In [24]: df = pd.DataFrame(np.random.randn(6, 4), columns=['ddd', 'fff', 'aaa', 'ppp'])
In [25]: df
Out[25]:
ddd fff aaa ppp
0 0.503059 1.188905 0.328281 0.375458
1 -0.543215 0.077681 0.305457 0.186163
2 -0.532685 0.210636 0.684265 0.681724
3 0.250131 0.067946 -1.134292 1.832272
4 -0.551448 0.649963 -0.834393 0.010211
5 1.274599 0.442398 -1.032405 -0.749949
Get the last row:
In [26]: last_row = df.loc[df.last_valid_index()]
Get the indices that would sort it:
In [27]: last_row.argsort()
Out[27]:
ddd 2
fff 3
aaa 1
ppp 0
Name: 5, Dtype: int32
Use these positions to index the columns of df:
In [28]: df.iloc[:, last_row.argsort().to_numpy()]
Out[28]:
aaa ppp fff ddd
0 0.328281 0.375458 1.188905 0.503059
1 0.305457 0.186163 0.077681 -0.543215
2 0.684265 0.681724 0.210636 -0.532685
3 -1.134292 1.832272 0.067946 0.250131
4 -0.834393 0.010211 0.649963 -0.551448
5 -1.032405 -0.749949 0.442398 1.274599
Profit!
The sort_values method does this directly when given the axis=1 argument.
sorted_df = df.sort_values(df.last_valid_index(), axis=1)
So, it is no longer necessary to transpose the DataFrame to sort by a row. Also, the sort method is now deprecated.
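To make the axis=1 behavior concrete, here is a small sketch with fixed values instead of random ones (the numbers are made up; only the column names come from the example above). It also checks that the result matches the transpose-based approach:

```python
import pandas as pd

# Fixed, made-up values so the resulting column order is predictable.
df = pd.DataFrame(
    [[0.5, 1.2, 0.3, 0.4],
     [1.3, 0.4, -1.0, -0.7]],
    columns=["ddd", "fff", "aaa", "ppp"],
)

last_label = df.index[-1]

# With axis=1, `by` names a row label and the columns are reordered.
sorted_df = df.sort_values(by=last_label, axis=1)

# Equivalent result via transposing, sorting rows, and transposing back.
via_transpose = df.T.sort_values(by=last_label).T

print(list(sorted_df.columns))
print(sorted_df.equals(via_transpose))
```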
I would use transpose and the sort_values method (which sorts rows by a column; older pandas versions called this method sort):
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(10, 4), columns=['ddd', 'fff', 'aaa', 'ppp'])
last_row_name = df.index[-1]
sorted_df = df.T.sort_values(by=last_row_name).T
You might suffer a performance hit but it is quick and easy.