
sort_values() got an unexpected keyword argument 'by'

for i in str_list:   # str_list is a set containing some strings
    df.loc[i].sort_values(by='XXX')
**TypeError**: sort_values() got an unexpected keyword argument 'by'
>>> type(df.loc[i])
pandas.core.frame.DataFrame

But it works outside the for loop!

df.loc['string'].sort_values(by='XXX')
>>> type(df.loc['string'])
pandas.core.frame.DataFrame

I'm confused.

sakimarquis asked Jun 05 '18 10:06


2 Answers

This is because, in your case, the result of .loc is a pandas.Series object rather than a pandas.DataFrame. Series.sort_values has no by keyword argument because it can only sort the values of the Series itself, so there is no column to choose. Have a look at the difference in the signature when you call sort_values on a pandas.DataFrame

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.sort_values.html

and when you call sort_values on a pandas.Series

http://pandas.pydata.org/pandas-docs/version/0.22/generated/pandas.Series.sort_values.html
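
The signature difference described above can be sketched as follows. This is a minimal illustration, not the asker's actual data: only the column name 'XXX' comes from the question, while the values and the second column 'YYY' are made up.

```python
import pandas as pd

# Hypothetical data; only the column name 'XXX' comes from the question.
df = pd.DataFrame({"XXX": [3, 1, 2], "YYY": ["c", "a", "b"]})
s = pd.Series([3, 1, 2], name="XXX")

# DataFrame.sort_values takes `by` to pick the column(s) to sort on.
sorted_df = df.sort_values(by="XXX")

# Series.sort_values sorts the Series' own values; it has no `by` parameter.
sorted_s = s.sort_values()

# Passing `by` to a Series raises the TypeError from the question.
try:
    s.sort_values(by="XXX")
    raised = False
except TypeError:
    raised = True

print(raised)
```

Calling sort_values() without by is therefore the form to use whenever the object at hand is a Series.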

kosnik answered Nov 01 '22 11:11


To add to the answer above: why does .loc return a Series in one case and a DataFrame in the other?

.loc returns a Series in the first case

for i in str_list:   # str_list is a set containing some strings
    df.loc[i].sort_values(by='XXX')

because the label i appears only once in the DataFrame's index.

In the second case, the label 'string' is duplicated in the index, so .loc returns a DataFrame.

df.loc['string'].sort_values(by = 'XXX')
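
A small sketch of this behaviour, assuming a hypothetical frame in which 'string' occurs twice in the index and a made-up label 'other' occurs once:

```python
import pandas as pd

# Hypothetical frame: 'string' appears twice in the index, 'other' once.
df = pd.DataFrame(
    {"XXX": [2, 1, 3]},
    index=["string", "string", "other"],
)

dup = df.loc["string"]   # two matching rows
uniq = df.loc["other"]   # one matching row

print(type(dup).__name__)
print(type(uniq).__name__)
```

With a scalar label, the type of the result follows from how many rows carry that label, which is why the same call can behave differently for different values of i.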

Even if the 'string' label is not duplicated, note that the result also depends on whether the argument to .loc is a list. For example:

df.loc['string'] -> returns a Series

df.loc[['string']] -> returns a DataFrame

Maybe in the second case you are passing ['string'] as the argument instead of 'string'?
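
One possible way to make the loop from the question robust is to always pass a list to .loc, so that every iteration works on a DataFrame. The sketch below assumes hypothetical data; str_list and 'XXX' are the names used in the question, while 'YYY', 'other', and sorted_parts are introduced only for illustration.

```python
import pandas as pd

# Hypothetical frame: 'string' is duplicated in the index, 'other' is unique.
df = pd.DataFrame(
    {"XXX": [2, 1, 3], "YYY": [5, 6, 7]},
    index=["string", "string", "other"],
)
str_list = {"string", "other"}  # a set of index labels, as in the question

sorted_parts = {}
for i in str_list:
    # A list argument makes .loc return a DataFrame even for a unique label,
    # so sort_values(by=...) is valid in every iteration.
    sorted_parts[i] = df.loc[[i]].sort_values(by="XXX")

for label, part in sorted_parts.items():
    print(label, type(part).__name__, part.shape)
```

sort_values returns a new object rather than sorting in place, so the result is stored here instead of being discarded as in the original loop.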

Hope this helps.

Carl Kristensen answered Nov 01 '22 12:11