Say I have a column foo in a dataframe df, which looks like:
0 abc1
1 def
2 g3sse1
3 f32asd
I want to remove the number at the end, if there is one, so that the column looks like this:
0 abc
1 def
2 g3sse
3 f32asd
The best I can do is:
df.loc[df['foo'].str[-1].str.isdigit(), 'foo'] = df['foo'].str[:-1]
This solves the problem, but I am curious whether there is a more elegant way to do this. I guess a regex won't make it look any better, but I appreciate any ideas!
Since you only want to remove trailing digits, and you would rather not use regular expressions, you can use rstrip together with Python's string module:
import string
df['foo_refined'] = df['foo'].str.rstrip(string.digits)
      foo foo_refined
0    abc1         abc
1     def         def
2  g3sse1       g3sse
3  f32asd      f32asd
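For comparison, here is a minimal sketch of how the rstrip approach lines up with a regex-based one. The extra row 'xyz123' is a hypothetical value added for illustration (it is not in the question's data), and the variable names are made up. It shows that rstrip removes the whole run of trailing digits, whereas removing a single character, as the slicing in the question does, would leave 'xyz12'.

```python
import string
import pandas as pd

# Data from the question plus one hypothetical multi-digit row ('xyz123').
df = pd.DataFrame({'foo': ['abc1', 'def', 'g3sse1', 'f32asd', 'xyz123']})

# rstrip removes every trailing character found in string.digits.
via_rstrip = df['foo'].str.rstrip(string.digits)

# Regex equivalent: \d+$ matches a run of digits anchored at the end.
via_regex = df['foo'].str.replace(r'\d+$', '', regex=True)

# Single-digit variant, comparable to slicing off one character.
one_digit = df['foo'].str.replace(r'\d$', '', regex=True)

print(via_rstrip.tolist())
print(via_regex.tolist())
print(one_digit.tolist())
```

If only one trailing digit should ever be dropped, the single-digit pattern is the closer match to the original code; otherwise rstrip is arguably the shortest option.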
Some examples of how rstrip behaves on plain strings:
>>> a = '12a'
>>> a.rstrip(string.digits)
'12a'
>>> b = '12a2'
>>> b.rstrip(string.digits)
'12a'
>>> c = '12a12x'
>>> c.rstrip(string.digits)
'12a12x'
>>> d = '123'
>>> d.rstrip(string.digits)
''
For reference, lstrip works the same way but, as expected, strips digits from the start of the string rather than from the end.
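A small sketch of the lstrip/rstrip difference on plain strings, using illustrative sample values chosen here:

```python
import string

# Illustrative sample values, not taken from the question's dataframe.
samples = ['12a', '12a2', '123']

# lstrip removes leading digits; rstrip removes trailing digits.
left = [s.lstrip(string.digits) for s in samples]
right = [s.rstrip(string.digits) for s in samples]

print(left)   # ['a', 'a2', '']
print(right)  # ['12a', '12a', '']
```

The same pairing should carry over to a pandas column via df['foo'].str.lstrip(string.digits).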