I have a dataframe with many columns of different types. One column contains both strings and integers, and I am trying to find the values in it with the longest/shortest length (note: not the largest value). (NOTE: The example I am using below only has integers in it because I couldn't work out how to mix dtypes and still call this an int64 column.)
     Name  MixedField
a   david       32252
b  andrew        4023
c  calvin          25
d   david           2
e  calvin         522
f   david          35
The method I am using is to convert the df column to a String Series (because they might be double/int/string/combinations), and then I can get the max/min length items from this series:
df['MixedField'].apply(str).map(len).max()
df['MixedField'].apply(str).map(len).min()
But I can't work out how to select the actual values that have the maximum and minimum length (i.e. 32252 (longest) and 2 (shortest)).
(I possibly don't need to explain this, but there is a subtle difference between largest and longest...(ie "aa" is longer than "z")). Appreciate your help. Thanks.
I think this should work if your df has unique indices.
field_length = df['MixedField'].astype(str).map(len)
# idxmax/idxmin return the index label of the first longest/shortest entry
print(df.loc[field_length.idxmax(), 'MixedField'])
print(df.loc[field_length.idxmin(), 'MixedField'])
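For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample frame and pulls out the longest and shortest values. Two things in it are my own assumptions and not from the question: I turned a couple of the sample values into strings so the column is genuinely mixed (object dtype), and the names lengths, all_longest and all_shortest are only illustrative. Note that idxmax/idxmin give you the first match only, so the sketch also shows a boolean mask that keeps every tied row and does not depend on the index being unique.

```python
import pandas as pd

# Sample data from the question. ASSUMPTION: "4023" and "522" are stored as
# strings here purely to make the column mixed (object dtype).
df = pd.DataFrame(
    {
        "Name": ["david", "andrew", "calvin", "david", "calvin", "david"],
        "MixedField": [32252, "4023", 25, 2, "522", 35],
    },
    index=list("abcdef"),
)

# Length of each value's string representation, aligned with df's index.
lengths = df["MixedField"].astype(str).str.len()

# First longest / first shortest value (requires unique index labels).
longest = df.loc[lengths.idxmax(), "MixedField"]
shortest = df.loc[lengths.idxmin(), "MixedField"]

# Every value tied for longest / shortest (works with duplicate labels too).
all_longest = df.loc[lengths == lengths.max(), "MixedField"]
all_shortest = df.loc[lengths == lengths.min(), "MixedField"]

print(longest, shortest)
print(all_longest)
print(all_shortest)
```

The values come back with their original type, so an integer stays an integer and a string stays a string; only the length comparison goes through str.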