
DataFrame: check whether a string ends with a number and remove it

Say I have a column foo in a dataframe df, which looks like:

0            abc1
1             def
2           g3sse1
3           f32asd

I want to remove the number at the end of each string, if there is one, so that the result looks like this:

0             abc
1             def
2            g3sse
3           f32asd

The best I can do is:

# Use .loc with a boolean mask to avoid chained assignment
df.loc[df['foo'].str[-1].str.isdigit(), 'foo'] = df['foo'].str[:-1]

This solves the problem, but I am curious whether there is a more elegant way to do it. I guess a regex won't make it look any better, but I appreciate any ideas!

Treeboy asked Feb 25 '26 19:02


1 Answer

Since you only need to remove trailing digits and would rather not use regular expressions, you can use str.rstrip together with string.digits from Python's string module:

import string
df['foo_refined'] = df['foo'].str.rstrip(string.digits)

      foo foo_refined
0    abc1         abc
1     def         def
2  g3sse1       g3sse
3  f32asd      f32asd
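
For anyone who wants to reproduce the output above, here is a minimal self-contained sketch. The DataFrame construction is my assumption about how df was built, since the question only shows the column values.

```python
import string

import pandas as pd

# Assumed reconstruction of the question's data; only the values of
# column "foo" come from the question itself.
df = pd.DataFrame({"foo": ["abc1", "def", "g3sse1", "f32asd"]})

# Strip every trailing ASCII digit (0-9). Digits that are not at the
# end of the string, such as the "3" in "g3sse1", are left untouched.
df["foo_refined"] = df["foo"].str.rstrip(string.digits)

print(df)
```

Note that rstrip removes the whole run of trailing digits, so a hypothetical value such as 'abc12' would become 'abc', whereas slicing off only the last character would leave 'abc1'. To overwrite the column instead of adding a new one, assign the result back to df["foo"].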

On plain Python strings, rstrip behaves like this:

>>> import string
>>> a = '12a'
>>> a.rstrip(string.digits)
'12a'

>>> b = '12a2'
>>> b.rstrip(string.digits)
'12a'

>>> c = '12a12x'
>>> c.rstrip(string.digits)
'12a12x'

>>> d = '123'
>>> d.rstrip(string.digits)
''

For reference, there is also lstrip, which works from the other side: used in this context, it would strip digits from the start of the string rather than from the end.
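
The difference between rstrip and lstrip can be checked on plain strings with a small sketch like the one below. The function names are hypothetical helpers made up for illustration, and the regex variant is included only for comparison with the approach the question wanted to avoid. It uses [0-9] and \Z so that it should match the same characters as string.digits.

```python
import re
import string


def strip_trailing_digits(s: str) -> str:
    """Remove the run of ASCII digits at the end of s, if any."""
    return s.rstrip(string.digits)


def strip_leading_digits(s: str) -> str:
    """Remove the run of ASCII digits at the start of s, if any."""
    return s.lstrip(string.digits)


def strip_trailing_digits_re(s: str) -> str:
    """Regex version of strip_trailing_digits, for comparison only."""
    return re.sub(r"[0-9]+\Z", "", s)


samples = ["12a", "12a2", "12a12x", "123"]
for s in samples:
    # Columns: original, trailing digits stripped, leading digits stripped
    print(repr(s), repr(strip_trailing_digits(s)), repr(strip_leading_digits(s)))
```

In pandas, the same comparison would use df['foo'].str.rstrip(string.digits) and df['foo'].str.lstrip(string.digits) respectively.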

sophocles answered Feb 28 '26 10:02