'DataFrame' object has no attribute 'sort'

Pandas Sorting 101

DataFrame.sort was deprecated in v0.17 and removed in v0.20; use DataFrame.sort_values and DataFrame.sort_index instead. There is also argsort, which returns the integer positions that would sort the data.
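As a rough illustration of the change (a minimal sketch, assuming pandas 0.20 or later; the frame here uses literal values for B rather than random ones), calling the removed method raises the error from the title, while the two replacements work:

```python
import pandas as pd

# Illustrative frame; the B values are written out literally (an assumption
# made here so that the example is deterministic).
df = pd.DataFrame({'A': list('accab'), 'B': [7, 9, 3, 5, 2]})

try:
    df.sort('A')  # the old API, no longer present on DataFrame
except AttributeError as exc:
    message = str(exc)  # "'DataFrame' object has no attribute 'sort'"

by_value = df.sort_values(by='A')  # sort rows by the values in column A
by_index = df.sort_index()         # sort rows by the index labels
```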

Here are some common use cases in sorting, and how to solve them using the sorting functions in the current API. First, the setup.

# Setup
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame({'A': list('accab'), 'B': np.random.choice(10, 5)})
df
   A  B
0  a  7
1  c  9
2  c  3
3  a  5
4  b  2

Sort by Single Column

For example, to sort df by column "A", use sort_values with a single column name:

df.sort_values(by='A')

   A  B
0  a  7
3  a  5
4  b  2
1  c  9
2  c  3

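Note that the original index labels travel with their rows. If you need a fresh RangeIndex, chain DataFrame.reset_index with drop=True so the old labels are discarded rather than added as a column.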
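A minimal sketch of renumbering after a sort, assuming the same frame as the setup but with B written out literally. The ignore_index variant assumes pandas 1.0 or later:

```python
import pandas as pd

# Same shape as the setup frame; B values are literal so the example is deterministic.
df = pd.DataFrame({'A': list('accab'), 'B': [7, 9, 3, 5, 2]})

# Sort, then discard the old labels and renumber from 0.
sorted_df = df.sort_values(by='A').reset_index(drop=True)

# Assumes pandas >= 1.0: the same renumbering in a single call.
sorted_df2 = df.sort_values(by='A', ignore_index=True)
```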

Sort by Multiple Columns

For example, to sort df by both columns "A" and "B", pass a list to sort_values:

df.sort_values(by=['A', 'B'])

   A  B
3  a  5
0  a  7
4  b  2
2  c  3
1  c  9
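If the columns need different directions, sort_values also accepts a list for ascending, one flag per column. A small sketch, again assuming literal B values in place of the random setup:

```python
import pandas as pd

df = pd.DataFrame({'A': list('accab'), 'B': [7, 9, 3, 5, 2]})

# A ascending, and within each value of A, B descending.
result = df.sort_values(by=['A', 'B'], ascending=[True, False])
print(result)
#    A  B
# 0  a  7
# 3  a  5
# 4  b  2
# 1  c  9
# 2  c  3
```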

Sort by DataFrame Index

For example, suppose the rows of df have been shuffled so that the index is out of order:

df2 = df.sample(frac=1)
df2

   A  B
1  c  9
0  a  7
2  c  3
3  a  5
4  b  2

To put the rows back in index order, use sort_index:

df2.sort_index()

   A  B
0  a  7
1  c  9
2  c  3
3  a  5
4  b  2

df.equals(df2)
# False
df.equals(df2.sort_index())
# True

Here are some equivalent ways to do this, with their timings on this small frame:

%timeit df2.sort_index()
%timeit df2.iloc[df2.index.argsort()]
%timeit df2.reindex(np.sort(df2.index))

605 µs ± 13.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
610 µs ± 24.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
581 µs ± 7.63 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Sort by List of Indices

For example, suppose you have an array of integer positions, such as the one argsort returns:

idx = df2.index.argsort()
idx
# array([1, 0, 2, 3, 4])

This "sorting" problem is actually a simple indexing problem. Passing the integer positions to iloc is all you need.

df.iloc[idx]

   A  B
1  c  9
0  a  7
2  c  3
3  a  5
4  b  2
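The same idea works for ordering by a column's values: argsort gives the positions, and iloc applies them. A minimal sketch, assuming literal B values and pandas 0.24 or later for to_numpy:

```python
import pandas as pd

df = pd.DataFrame({'A': list('accab'), 'B': [7, 9, 3, 5, 2]})

# Positions that would sort column B in ascending order.
order = df['B'].to_numpy().argsort()  # array([4, 2, 3, 0, 1])

# iloc selects rows by position, so this reorders the frame by B.
result = df.iloc[order]
```

With no ties in B, this gives the same frame as df.sort_values(by='B').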
