
Python Regex remove numbers and numbers with punctuation

Tags:

python

regex

I have the following string:

 line = "1234567 7852853427.111 https://en.wikipedia.org/wiki/Dictionary_(disambiguation)"

I would like to remove the numbers 1234567 and 7852853427.111 using regular expressions.

I have this regex:

nline = re.sub("^\d+\s|\s\d+\s|\s\d\w\d|\s\d+$", " ", line)

but it is not doing what I hoped it would.

Can anyone point me in the right direction?

Morpheus asked Mar 12 '23 02:03


2 Answers

You can use:

>>> import re
>>> line = "1234567 7852853427.111 https://en.wikipedia.org/wiki/Dictionary_(disambiguation)"
>>> print(re.sub(r'\b\d+(?:\.\d+)?\s+', '', line))
https://en.wikipedia.org/wiki/Dictionary_(disambiguation)

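The regex \b\d+(?:\.\d+)?\s+ matches an integer or decimal number followed by one or more whitespace characters. \b is a word boundary, so digits in the middle of a word are not matched. Because the pattern requires trailing whitespace, a number at the very end of the string is left in place.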
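As a rough sketch, the same pattern can be wrapped in a small helper so it is easier to try on other inputs. The names NUMBER_THEN_SPACE and strip_numbers are made up for this illustration and are not part of the re module:

```python
import re

# Same pattern as in the answer: an integer or decimal number
# followed by one or more whitespace characters.
# (NUMBER_THEN_SPACE and strip_numbers are hypothetical names.)
NUMBER_THEN_SPACE = re.compile(r'\b\d+(?:\.\d+)?\s+')

def strip_numbers(text):
    """Remove integers/decimals that are followed by whitespace."""
    return NUMBER_THEN_SPACE.sub('', text)

line = "1234567 7852853427.111 https://en.wikipedia.org/wiki/Dictionary_(disambiguation)"
result = strip_numbers(line)
print(result)

# A number at the end of the string has no trailing whitespace,
# so this pattern does not remove it.
print(strip_numbers("abc 42"))
```

If trailing numbers also need to go, the pattern would have to be adjusted; that is outside what the original answer covers.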

anubhava answered Mar 20 '23 12:03


Here's a non-regex approach, if your regex requirement is not entirely strict, using itertools.dropwhile:

>>> from itertools import dropwhile
>>> ''.join(dropwhile(lambda x: not x.isalpha(), line))
'https://en.wikipedia.org/wiki/Dictionary_(disambiguation)'
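A self-contained sketch of the same idea, under the assumption that the numbers only ever appear at the start of the string. dropwhile stops dropping at the first alphabetic character, so anything after that point is kept unchanged. The helper name drop_leading_non_letters is hypothetical:

```python
from itertools import dropwhile

def drop_leading_non_letters(text):
    """Discard characters until the first alphabetic one, keep the rest."""
    return ''.join(dropwhile(lambda ch: not ch.isalpha(), text))

line = "1234567 7852853427.111 https://en.wikipedia.org/wiki/Dictionary_(disambiguation)"
cleaned = drop_leading_non_letters(line)
print(cleaned)

# Only the leading run is dropped: the "34" after "abc" stays.
print(drop_leading_non_letters("12 abc 34 def"))
```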

Moses Koledoye answered Mar 20 '23 11:03