
What's the difference between Series.sort() and Series.order()?

Tags: python, pandas

import numpy.random as nr
import pandas as pd
s = pd.Series(nr.randint(0, 10, 5), index=nr.randint(0, 10, 5))
s

Output

1    3
7    6
2    0
9    7
1    6

order() sorts by value and returns a new Series:

s.order()

Output

2    0
1    3
7    6
1    6
9    7

It looks like sort also sorts by value, but in place:

s.sort()
s

Output

2    0
1    3
7    6
1    6
9    7

Is this the only difference between the two methods?

asked May 16 '14 at 13:05 by usual me



2 Answers

Your Question: Is this (Series.sort sorting in place vs. Series.order returning a new object) the only difference between the two methods?

BEFORE pandas 0.17.0 Final release (i.e. before 2015-10-09)

Short Answer: YES. They are functionally equivalent.

Longer Answer:

pandas.Series.sort(): changes the object itself (in-place sorting) and returns nothing.

Sort values and index labels by value. This is an inplace sort by default. Series.order is the equivalent but returns a new Series.

So

>>> s = pd.Series([3,4,0,3]).sort()
>>> s

outputs nothing, because sort() returns None, so s ends up bound to None instead of a sorted Series.
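Series.sort() is not available in recent pandas releases, so the snippet above cannot be run there as written. As a rough sketch of the same pitfall, assuming a current pandas install, sort_values(inplace=True) behaves the same way with respect to its return value:

```python
import pandas as pd

# Assigning the result of an in-place sort binds the name to None,
# because the in-place call returns nothing.
s = pd.Series([3, 4, 0, 3]).sort_values(inplace=True)
print(s)  # None

# Sort in place first, then keep using the original name.
t = pd.Series([3, 4, 0, 3])
t.sort_values(inplace=True)
print(t.tolist())  # [0, 3, 3, 4]
```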

pandas.Series.order(): does not change the object; instead it returns a new sorted object.

Sorts Series object, by value, maintaining index-value link. This will return a new Series by default. Series.sort is the equivalent but as an inplace method.


AFTER pandas 0.17.0 Final release (i.e. after 2015-10-09)

The sorting API changed and became cleaner and more pleasant to use.

To sort by the values, both Series.sort() and Series.order() are DEPRECATED, replaced by the new Series.sort_values() API, which returns a sorted Series object.

To summarize the changes (excerpt from the pandas 0.17.0 docs):

To sort by the values (A * marks items that will show a FutureWarning):

        Previous              |         Replacement
------------------------------|-----------------------------------
* Series.order()              |  Series.sort_values()
* Series.sort()               |  Series.sort_values(inplace=True)
* DataFrame.sort(columns=...) |  DataFrame.sort_values(by=...) 
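
The replacements in the table can be tried out with a short sketch like the one below. The sample data and variable names are made up for illustration; only sort_values() and its inplace and by parameters come from the table above.

```python
import pandas as pd

s = pd.Series([3, 6, 0, 7, 6], index=[1, 7, 2, 9, 1])

# Replacement for Series.order(): returns a new sorted Series, s is untouched.
sorted_copy = s.sort_values()
print(sorted_copy.tolist())  # [0, 3, 6, 6, 7]
print(s.tolist())            # [3, 6, 0, 7, 6]

# Replacement for Series.sort(): sorts s itself and returns None.
result = s.sort_values(inplace=True)
print(result)                # None
print(s.tolist())            # [0, 3, 6, 6, 7]

# Replacement for DataFrame.sort(columns=...): pass the column via by=.
df = pd.DataFrame({"a": [2, 1, 3], "b": ["x", "y", "z"]})
by_a = df.sort_values(by="a")
print(by_a["b"].tolist())    # ['y', 'x', 'z']
```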
answered Oct 10 '22 at 10:10 by YaOzI


Looking at the pandas source code (and skipping the docstring):

def sort(self, axis=0, ascending=True, kind='quicksort', na_position='last', inplace=True):
        return self.order(ascending=ascending,
                          kind=kind,
                          na_position=na_position,
                          inplace=inplace)

Compare this with the declaring line of order (I'm using 0.14.1)

def order(self, na_last=None, ascending=True, kind='quicksort', na_position='last', inplace=False)

You can see that, since sort calls the order function, the two are for all intents and purposes identical under the hood apart from their default parameters.

As noted in the question, the default value of the inplace parameter differs (inplace=True for sort, inplace=False for order), but there is no other difference in behaviour.

The only other difference is that order has an additional (but deprecated) parameter, na_last, which you cannot use with sort (and shouldn't be using anyway).
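
The delegation pattern described above can be illustrated with a toy class. This is a hypothetical stand-in written for this answer, not pandas code: TinySeries and its attributes are invented, and only the idea that sort forwards to order with a different inplace default is taken from the source shown earlier.

```python
class TinySeries:
    """Hypothetical stand-in for a Series; not pandas code."""

    def __init__(self, values):
        self.values = list(values)

    def order(self, ascending=True, inplace=False):
        result = sorted(self.values, reverse=not ascending)
        if inplace:
            # Mutate this object and return nothing.
            self.values = result
            return None
        # Leave this object alone and hand back a new one.
        return TinySeries(result)

    def sort(self, ascending=True, inplace=True):
        # Same work as order(); only the default for inplace differs.
        return self.order(ascending=ascending, inplace=inplace)


t = TinySeries([3, 4, 0, 3])
new = t.order()
print(new.values, t.values)  # [0, 3, 3, 4] [3, 4, 0, 3]
print(t.sort())              # None
print(t.values)              # [0, 3, 3, 4]
```

Calling t.sort(inplace=False) in this sketch would behave exactly like t.order(), which mirrors the point that the two methods differ only in their defaults.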

answered Oct 10 '22 at 09:10 by undershock