
How to remove words containing only numbers in Python?

I have some text in Python that consists of digits and letters. Something like this:

s = "12 word word2"

From the string s, I want to remove all the words that contain only digits.

So I want the result to be

s = "word word2"

This is the regex I have, but it also acts on letters, i.e. it replaces each letter with a space.

re.sub('[\ 0-9\ ]+', ' ', line)

Can someone tell me what is wrong? Also, is there a more time-efficient way to do this than a regex?

Thanks!

asked Dec 02 '22 14:12 by silent_dev


2 Answers

You can use this regex:

>>> import re
>>> s = "12 word word2"
>>> print(re.sub(r'\b[0-9]+\b\s*', '', s))
word word2

\b matches a word boundary, so \b[0-9]+\b only matches a run of digits that forms a whole word, and \s* removes zero or more whitespace characters after that numeric word.
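
To make the effect of \b and \s* easier to see, here is a minimal sketch that wraps the pattern in a helper and tries it on a few inputs. The function name remove_numeric_words and the second pattern (TOKEN_PATTERN, which uses lookarounds so that digits must be delimited by whitespace) are assumptions added for illustration, not part of the answer above.

```python
import re

# Pattern from the answer: a run of digits between word boundaries,
# followed by any trailing whitespace.
BOUNDARY_PATTERN = re.compile(r'\b[0-9]+\b\s*')

# Hypothetical stricter variant: the digits must be surrounded by
# whitespace or by the start/end of the string.
TOKEN_PATTERN = re.compile(r'(?<!\S)[0-9]+(?!\S)\s*')


def remove_numeric_words(text, pattern=BOUNDARY_PATTERN):
    """Remove digit-only words; strip() drops a leftover trailing space."""
    return pattern.sub('', text).strip()


print(remove_numeric_words("12 word word2"))          # word word2
print(remove_numeric_words("word2 12"))               # word2
print(remove_numeric_words("3.14 x"))                 # .x
print(remove_numeric_words("3.14 x", TOKEN_PATTERN))  # 3.14 x
```

The third call shows a side effect of \b: the "." counts as a word boundary, so "3" and "14" are each treated as numeric words. If tokens such as "3.14" should be left alone, a whitespace-delimited pattern like the second one may be a better fit.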

answered Dec 19 '22 08:12 by anubhava


Using a regex is probably a bit overkill here, depending on whether you need to preserve whitespace:

s = "12 word word2"
s2 = ' '.join(word for word in s.split() if not word.isdigit())
# 'word word2'
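
Since the question also asks about speed, here is a rough sketch for comparing the two approaches with timeit. The function names and the sample input are assumptions made up for this illustration, and the timings will vary with the machine and the input, so treat the result as indicative only.

```python
import re
import timeit

NUMERIC_WORD = re.compile(r'\b[0-9]+\b\s*')


def drop_with_regex(text):
    # Regex approach: keeps the original spacing between remaining words.
    return NUMERIC_WORD.sub('', text)


def drop_with_split(text):
    # split() with no arguments splits on any run of whitespace,
    # so the original spacing is collapsed to single spaces.
    return ' '.join(word for word in text.split() if not word.isdigit())


sample = "12 word word2 " * 1000  # hypothetical test input

for func in (drop_with_regex, drop_with_split):
    seconds = timeit.timeit(lambda: func(sample), number=100)
    print(f"{func.__name__}: {seconds:.4f} s")
```

Note that the two functions are not equivalent on every input: drop_with_split("12   word\tword2") returns 'word word2', while the regex version leaves the tab between the words in place.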

answered Dec 19 '22 08:12 by Jon Clements