
Slicing Dataframe column based on length of strings

Tags: python, pandas

I would like to remove the first 3 characters from strings in a DataFrame column where the length of the string is greater than 4.

Otherwise, the strings should remain unchanged.

E.g.:

bloomberg_ticker_y

AIM9
DJEM9 # (should be M9)
FAM9
IXPM9 # (should be M9)

I can filter the strings by length:

merged['bloomberg_ticker_y'].str.len() > 4

and slice the strings:

merged['bloomberg_ticker_y'].str[-2:]

But I am not sure how to combine the two and apply the result to my DataFrame.

Any help would be appreciated.

User63164 asked Jul 01 '19 14:07


1 Answer

You can use a list comprehension:

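import pandas as pd

df = pd.DataFrame({'bloomberg_ticker_y': ['AIM9', 'DJEM9', 'FAM9', 'IXPM9']})

# Keep only the last two characters when the string is longer than 4; otherwise keep it as-is
df['new'] = [x[-2:] if len(x) > 4 else x for x in df['bloomberg_ticker_y']]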

Output:

  bloomberg_ticker_y   new
0               AIM9  AIM9
1              DJEM9    M9
2               FAM9  FAM9
3              IXPM9    M9
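
A vectorized alternative is to combine the length filter and the slice from the question with `Series.mask`, which replaces values where the condition is true. The following is only a sketch under a few assumptions: the column contains no missing values, and the column names `new_vectorized` and `new_drop3` are illustrative, not from the question. It also shows the literal "remove the first 3 characters" reading (`str[3:]`), which gives the same result on this sample but would differ from `str[-2:]` for strings longer than 5 characters.

```python
import pandas as pd

df = pd.DataFrame({'bloomberg_ticker_y': ['AIM9', 'DJEM9', 'FAM9', 'IXPM9']})

s = df['bloomberg_ticker_y']
long_mask = s.str.len() > 4  # True where the string has more than 4 characters

# Keep the last two characters where the mask is True, else keep the original value
df['new_vectorized'] = s.mask(long_mask, s.str[-2:])

# Literal "drop the first 3 characters" variant (hypothetical column name)
df['new_drop3'] = s.mask(long_mask, s.str[3:])

print(df)
```

Which slice to use depends on whether the goal is "keep the last two characters" or "drop the first three"; the question's wording and its example code suggest different things, so it may be worth checking against longer tickers.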
vlemaistre answered Sep 21 '22 08:09