
Sort dataframe by string length

I want to sort by name length. There doesn't appear to be a key parameter for sort_values so I'm not sure how to accomplish this. Here is a test df:

import pandas as pd

df = pd.DataFrame({'name': ['Steve', 'Al', 'Markus', 'Greg'], 'score': [2, 4, 2, 3]})
AlexG asked Feb 28 '17 18:02


1 Answer

You can compute the length of each name with str.len, sort those lengths with sort_values, and pass the resulting index to reindex:

print(df.name.str.len())
0    5
1    2
2    6
3    4
Name: name, dtype: int64

print(df.name.str.len().sort_values())
1    2
3    4
0    5
2    6
Name: name, dtype: int64

s = df.name.str.len().sort_values().index
print(s)
Int64Index([1, 3, 0, 2], dtype='int64')

print(df.reindex(s))
     name  score
1      Al      4
3    Greg      3
0   Steve      2
2  Markus      2

df1 = df.reindex(s)
df1 = df1.reset_index(drop=True)
print(df1)
     name  score
0      Al      4
1    Greg      3
2   Steve      2
3  Markus      2
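
For readers on a newer pandas: sort_values gained a key parameter in pandas 1.1.0, which did not exist when the question was asked. The sketch below assumes pandas 1.1.0 or later and reuses the test df from the question; the name df_sorted is only illustrative. It is an alternative to the reindex approach above, not a replacement for it.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Steve', 'Al', 'Markus', 'Greg'],
                   'score': [2, 4, 2, 3]})

# key receives the whole 'name' column as a Series and must return
# a Series of the same length; rows are ordered by the returned values.
# Assumes pandas >= 1.1.0, where the key parameter was introduced.
df_sorted = df.sort_values('name', key=lambda col: col.str.len())

# Renumber the rows 0..n-1, as in the reset_index step above.
df_sorted = df_sorted.reset_index(drop=True)
print(df_sorted)
```

On older versions the key argument is not available, so the reindex approach is the one to use there.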
jezrael answered Sep 19 '22 04:09