My Pandas data frame contains the following data:
product,values
 a1,     10
 a5,     20
 a10,    15
 a2,     45
 a3,     12
 a6,     67
I need to sort this data frame by the product column, in descending order of the number that follows the letter. I would like to get the following output:
product,values
 a10,     15
 a6,      67
 a5,      20
 a3,      12
 a2,      45
 a1,      10
Unfortunately, I'm facing the following error:
ErrorDuringImport(path, sys.exc_info())
ErrorDuringImport: problem in views - type 'exceptions.Indentation
To sort a DataFrame by its column names, call sort_index() on the DataFrame with the argument axis=1. The call does not modify the original DataFrame; it returns a copy whose columns are sorted by name.
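As a minimal sketch of the axis=1 behaviour described above (the frame and its count column are made up for illustration and are not from the question):

```python
import pandas as pd

# Hypothetical frame whose columns are deliberately out of alphabetical order.
df = pd.DataFrame({"values": [10, 20], "product": ["a1", "a5"], "count": [1, 2]})

# axis=1 sorts by column labels and returns a new DataFrame.
sorted_cols = df.sort_index(axis=1)

print(list(sorted_cols.columns))
print(list(df.columns))  # the original column order is unchanged
```

Note that this reorders columns only; the row order, which is what the question asks about, stays the same.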
To group a Pandas DataFrame, use groupby(). To sort the grouped result in ascending or descending order, use sort_values(). On a grouped object, the size() method returns the number of rows in each group.
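A short sketch of how these three calls might be chained, assuming a made-up frame in which product values repeat (the question's data has no repeats, so this is illustration only):

```python
import pandas as pd

# Hypothetical data with repeated products so that grouping is meaningful.
df = pd.DataFrame({
    "product": ["a1", "a2", "a1", "a3", "a1", "a2"],
    "values": [10, 45, 12, 67, 20, 15],
})

# size() counts the rows in each group; sort_values() then orders those counts.
group_sizes = df.groupby("product").size().sort_values(ascending=False)

print(group_sizes)
```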
To sort a Pandas DataFrame by index, use the DataFrame.sort_index() method. Set the boolean argument ascending to True or False to sort in ascending or descending order of the index. The rows are rearranged to follow the sorted index.
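For illustration, a small sketch of sort_index() with the ascending argument, using an invented frame with a shuffled integer index:

```python
import pandas as pd

# Hypothetical frame with an out-of-order index.
df = pd.DataFrame({"product": ["a10", "a1", "a5"]}, index=[2, 0, 1])

by_index_asc = df.sort_index()                  # ascending=True is the default
by_index_desc = df.sort_index(ascending=False)  # highest index label first

print(by_index_asc)
print(by_index_desc)
```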
You can first extract the digits with str.extract and cast them to int with astype. Then call sort_values on the helper column sort, and finally drop that column:
df['sort'] = df['product'].str.extract(r'(\d+)', expand=False).astype(int)
df.sort_values('sort',inplace=True, ascending=False)
df = df.drop('sort', axis=1)
print (df)
  product  values
2     a10      15
5      a6      67
1      a5      20
4      a3      12
3      a2      45
0      a1      10
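A possible variant of the same idea, assuming pandas 1.1 or newer where sort_values accepts a key argument: the digits are extracted inside the key function, so no helper column has to be created and dropped. This is a sketch rather than part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "product": ["a1", "a5", "a10", "a2", "a3", "a6"],
    "values": [10, 20, 15, 45, 12, 67],
})

# The key function receives the whole column and must return a Series
# of the same length; here it returns the numeric part as integers.
result = df.sort_values(
    "product",
    key=lambda col: col.str.extract(r"(\d+)", expand=False).astype(int),
    ascending=False,
)

print(result)
```

Either way, converting the numeric part to int is the essential step.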
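The numeric helper column is necessary because sort_values on its own compares the product strings character by character, so a10 ends up between a2 and a1: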
df.sort_values('product',inplace=True, ascending=False)
print (df)
  product  values
5      a6      67
1      a5      20
4      a3      12
3      a2      45
2     a10      15
0      a1      10
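The same effect can be reproduced without pandas. The sketch below compares plain string ordering with ordering by the numeric suffix; slicing with p[1:] assumes every product has exactly one leading letter, which holds for the sample data but is an assumption otherwise.

```python
products = ["a1", "a5", "a10", "a2", "a3", "a6"]

# String comparison goes character by character, so "a10" < "a2"
# because "1" < "2" at the second position.
as_text = sorted(products, reverse=True)

# Comparing the integer after the single-letter prefix gives numeric order.
as_numbers = sorted(products, key=lambda p: int(p[1:]), reverse=True)

print(as_text)
print(as_numbers)
```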
Another option is to use the natsort library:
from natsort import index_natsorted, order_by_index
df = df.reindex(index=order_by_index(df.index, index_natsorted(df['product'], reverse=True)))
print (df)
  product  values
2     a10      15
5      a6      67
1      a5      20
4      a3      12
3      a2      45
0      a1      10
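If installing natsort is not an option, a rough stand-in can be written with the standard re module. The helper natural_key below is hypothetical and far simpler than what natsort does (no signs, floats, or locale handling); it only splits each string into text and digit runs and compares the digit runs as integers.

```python
import re
import pandas as pd

def natural_key(text):
    # "a10" -> ["a", 10, ""]: digit runs become ints, everything else stays text.
    return [int(part) if part.isdigit() else part
            for part in re.split(r"(\d+)", text)]

df = pd.DataFrame({
    "product": ["a1", "a5", "a10", "a2", "a3", "a6"],
    "values": [10, 20, 15, 45, 12, 67],
})

# Order the index labels by the natural key of their product, then reselect rows.
order = sorted(df.index, key=lambda i: natural_key(df.at[i, "product"]), reverse=True)
natural_sorted = df.loc[order]

print(natural_sorted)
```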