
How to get the longest length string/integer/float from a pandas column when there are strings in the column

Tags: python, pandas

I have a dataframe with many columns of data of different types. I have encountered one column that contains both strings and integers. I am trying to find the values with the longest/shortest length (note: not the largest value). (NOTE: The example I am using below only has integers in it because I couldn't work out how to mix dtypes and still call this an int64 column.)

    Name    MixedField
a   david   32252
b   andrew  4023
c   calvin  25
d   david   2
e   calvin  522
f   david   35

The method I am using is to convert the df column to a String Series (because they might be double/int/string/combinations), and then I can get the max/min length items from this series:

df['MixedField'].apply(str).map(len).max()
df['MixedField'].apply(str).map(len).min()

But I can't work out how to select the actual values that have the maximum and minimum length (i.e. 32252 (longest) and 2 (shortest)).

(I possibly don't need to explain this, but there is a subtle difference between largest and longest...(ie "aa" is longer than "z")). Appreciate your help. Thanks.

Calamari asked Jan 13 '16 03:01


1 Answer

I think this should work if your df has a unique index.

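field_length = df.MixedField.astype(str).map(len)
# idxmax/idxmin return the index label of the first longest/shortest value;
# in current pandas, argmax/argmin return an integer position instead.
print(df.loc[field_length.idxmax(), 'MixedField'])
print(df.loc[field_length.idxmin(), 'MixedField'])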
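
As a rough, self-contained sketch of the same idea on a column that really does mix strings and integers: the sample data below is hypothetical (the question's frame with two values swapped for strings, which makes the column dtype object), and the boolean-mask variant at the end is an addition that should also cope with ties and a non-unique index.

```python
import pandas as pd

# Hypothetical sample data: the question's frame with two strings mixed in,
# so MixedField has dtype object rather than int64.
df = pd.DataFrame(
    {
        "Name": ["david", "andrew", "calvin", "david", "calvin", "david"],
        "MixedField": [32252, "4023", 25, 2, "abc", 35],
    },
    index=list("abcdef"),
)

# Length of each value's string representation.
field_length = df["MixedField"].astype(str).str.len()

# idxmax/idxmin give the index label of the first longest/shortest value,
# so this lookup assumes the index labels are unique.
longest = df.loc[field_length.idxmax(), "MixedField"]
shortest = df.loc[field_length.idxmin(), "MixedField"]

# A boolean mask selects every row that ties for the extreme length and
# does not rely on the index being unique.
all_longest = df.loc[field_length == field_length.max(), "MixedField"]
all_shortest = df.loc[field_length == field_length.min(), "MixedField"]

print(longest, shortest)
```

The mask version returns a Series rather than a single value, which may be preferable when several values share the maximum or minimum length.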
Happy001 answered Oct 14 '22 03:10