Python pandas provides two methods for sorting a DataFrame: sort_values and sort_index.
What are the differences between these two methods?
The sort_index() method sorts a Series by its index labels. It returns a new Series sorted by label if the inplace argument is False; otherwise it updates the original Series and returns None.
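As a minimal sketch of the inplace behaviour described above (the Series contents and variable names are made up for illustration):

```python
import pandas as pd

# Hypothetical example data: labels deliberately out of order.
s = pd.Series([10, 20, 30], index=["c", "a", "b"])

# Default (inplace=False): a new Series is returned, the original is untouched.
sorted_s = s.sort_index()
order_before = list(s.index)  # still the original order

# inplace=True: the original Series is reordered and the call returns None.
result = s.sort_index(inplace=True)
order_after = list(s.index)
```

Assigning the result of an inplace call back to the same name is a common mistake, since it replaces the Series with None.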
When it comes to selecting rows and columns of a pandas DataFrame, loc and iloc are two commonly used indexers. Here is the subtle difference between the two: loc selects rows and columns by label, while iloc selects rows and columns by integer position.
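The difference is easiest to see when the row labels are integers that do not match the row positions. A small sketch, with made-up data:

```python
import pandas as pd

# Hypothetical frame whose integer labels differ from the row positions.
df = pd.DataFrame({"score": [90, 80, 70]}, index=[2, 0, 1])

# loc looks up the row *labelled* 0, which is the second row here.
by_label = df.loc[0, "score"]

# iloc looks up the row at *position* 0, which is the first row.
by_position = df.iloc[0, 0]
```

Here by_label is 80 and by_position is 90, even though both calls were given 0 as the row.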
To sort the DataFrame based on the values in a single column, you'll use .sort_values(). By default, this will return a new DataFrame sorted in ascending order.
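For instance, sorting by a single column might look like the following sketch (the column names and values are assumptions chosen for illustration):

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"name": ["x", "y", "z"], "price": [3, 1, 2]})

# Ascending order is the default; a new DataFrame is returned.
ascending = df.sort_values("price")

# Pass ascending=False for descending order.
descending = df.sort_values("price", ascending=False)
```

The rows keep their original index labels after sorting, so the index of the result is no longer in order.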
To sort by the row index or by the column labels (that is, by row or column names), use the sort_index() method.
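A short sketch of both directions, using made-up labels; axis=1 is what switches sort_index() from row labels to column labels:

```python
import pandas as pd

# Hypothetical frame with both row and column labels out of order.
df = pd.DataFrame({"b": [1, 2], "a": [3, 4]}, index=["y", "x"])

# Default axis=0: reorder the rows by their index labels.
by_rows = df.sort_index()

# axis=1: reorder the columns by their names instead.
by_cols = df.sort_index(axis=1)
```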
As the question was updated to ask for the difference between sort_values (as sort is deprecated) and sort_index, the answer of @mathdan no longer reflects the current state with the latest pandas version (>= 0.17.0).
sort_values is meant to sort by the values of columns.
sort_index is meant to sort by the index labels (or a specific level of the index, or the column labels when axis=1).
Previously, sort (deprecated starting from pandas 0.17.0) and sort_index were indeed almost identical (both methods could sort by both columns and index). But this confusing situation was resolved in 0.17.0.
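To make the split concrete, here is a minimal sketch on a DataFrame with a two-level index. The level names "group" and "rank" and the data are assumptions for illustration only:

```python
import pandas as pd

# Hypothetical two-level index; "group" and "rank" are illustrative names.
index = pd.MultiIndex.from_tuples(
    [("b", 2), ("a", 1), ("b", 1), ("a", 2)],
    names=["group", "rank"],
)
df = pd.DataFrame({"value": [40, 10, 30, 20]}, index=index)

# sort_values: order the rows by the data in a column.
by_value = df.sort_values("value")

# sort_index: order the rows by index labels, here by one specific level.
by_rank = df.sort_index(level="rank")
```

The two calls generally give different row orders, because one looks only at the column data and the other only at the labels.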
For an overview of the changes in the sorting API, see http://pandas.pydata.org/pandas-docs/stable/whatsnew/v0.17.0.html#changes-to-sorting-api