I have the following string:
line = "1234567 7852853427.111 https://en.wikipedia.org/wiki/Dictionary_(disambiguation)"
I would like to remove the numbers 1234567 7852853427.111 using regular expressions.
I have this regex:
nline = re.sub("^\d+\s|\s\d+\s|\s\d\w\d|\s\d+$", " ", line)
but it is not doing what I hoped it would.
Can anyone point me in the right direction?
You can use:
>>> import re
>>> line = "1234567 7852853427.111 https://en.wikipedia.org/wiki/Dictionary_(disambiguation)"
>>> print(re.sub(r'\b\d+(?:\.\d+)?\s+', '', line))
https://en.wikipedia.org/wiki/Dictionary_(disambiguation)
The regex \b\d+(?:\.\d+)?\s+ matches an integer or decimal number followed by one or more whitespace characters. \b asserts a word boundary.
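One caveat: the pattern is not anchored, so it also removes a number followed by whitespace in the middle of a string. Here is a minimal sketch of an anchored variant, assuming only the leading numeric tokens should go (the helper name strip_leading_numbers is made up for illustration):

```python
import re

# One or more "number + whitespace" groups, anchored at the start of the string.
LEADING_NUMBERS = re.compile(r'^(?:\d+(?:\.\d+)?\s+)+')

def strip_leading_numbers(text):
    """Remove integer/decimal tokens only from the beginning of text."""
    return LEADING_NUMBERS.sub('', text)

line = "1234567 7852853427.111 https://en.wikipedia.org/wiki/Dictionary_(disambiguation)"
print(strip_leading_numbers(line))
# https://en.wikipedia.org/wiki/Dictionary_(disambiguation)

# Unanchored vs. anchored on a string with a number in the middle:
sample = "42 see page 7 of x"
print(re.sub(r'\b\d+(?:\.\d+)?\s+', '', sample))  # see page of x
print(strip_leading_numbers(sample))               # see page 7 of x
```

As for why the attempt in the question falls short: every alternative except the first requires whitespace before the digits, but the space after 1234567 is already consumed by the first match, and none of the alternatives allows for the "." in 7852853427.111. So only the first number gets replaced.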
Here's a non-regex approach, if your regex requirement is not entirely strict, using itertools.dropwhile:
>>> from itertools import dropwhile
>>> ''.join(dropwhile(lambda x: not x.isalpha(), line))
'https://en.wikipedia.org/wiki/Dictionary_(disambiguation)'
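Note that this drops every leading character that is not a letter, so it works here because the URL starts with "h". If the remaining text could itself start with a digit or punctuation, a token-based variant might be safer. A rough sketch, assuming whitespace-separated tokens and treating anything float() accepts as a number (is_number and drop_leading_numbers are hypothetical helper names):

```python
from itertools import dropwhile

def is_number(token):
    """Return True if float() can parse the token (a loose test: it also
    accepts forms such as '1e5' or 'nan')."""
    try:
        float(token)
        return True
    except ValueError:
        return False

def drop_leading_numbers(text):
    """Drop leading numeric tokens; runs of whitespace collapse to single spaces."""
    return " ".join(dropwhile(is_number, text.split()))

line = "1234567 7852853427.111 https://en.wikipedia.org/wiki/Dictionary_(disambiguation)"
print(drop_leading_numbers(line))
# https://en.wikipedia.org/wiki/Dictionary_(disambiguation)
```

Because dropwhile stops at the first token that is not a number, numbers later in the string are left alone.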