Python regex findall numbers and dots

Question

I'm using re.findall() to extract some version numbers from an HTML file:

>>> import re
>>> text = "<table><td><a href=\"url\">Test0.2.1.zip</a></td><td>Test0.2.1</td></table> Test0.2.1"
>>> re.findall(r"Test([\.0-9]*)", text)
['0.2.1.', '0.2.1', '0.2.1']

but I would like to only get the ones that do not end in a dot. The filename might not always be .zip so I can't just stick .zip in the regex.

I want to end up with:

['0.2.1', '0.2.1']

Can anyone suggest a better regex to use? :)

Tomalak · Accepted Answer

re.findall(r"Test([0-9.]*[0-9]+)", text)

or, a bit shorter:

re.findall(r"Test([\d.]*\d+)", text)
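
As a quick check, here is a minimal sketch that runs both patterns against the sample string from the question. Note that they trim the trailing dot from the filename match rather than dropping that match, so three items come back, not two. The last pattern is not part of the answer: it is one possible variant, assuming the goal is to skip any version that is immediately followed by a dot or another digit, and it uses a negative lookahead for that.

```python
import re

# Sample string from the question.
text = ('<table><td><a href="url">Test0.2.1.zip</a></td>'
        '<td>Test0.2.1</td></table> Test0.2.1')

# Patterns from the answer: the capture must end in a digit, so the
# trailing dot before "zip" is given back by backtracking.
trimmed = re.findall(r"Test([0-9.]*[0-9]+)", text)
shorter = re.findall(r"Test([\d.]*\d+)", text)
print(trimmed)  # ['0.2.1', '0.2.1', '0.2.1']
print(shorter)  # ['0.2.1', '0.2.1', '0.2.1']

# Assumed variant (not from the answer): reject a version entirely when
# the next character is a dot or a digit, e.g. the one inside "Test0.2.1.zip".
strict = re.findall(r"Test(\d+(?:\.\d+)*)(?![\d.])", text)
print(strict)  # ['0.2.1', '0.2.1']
```

Which behaviour is wanted depends on whether the filename occurrence should count as a match at all.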

By the way, you do not need to escape the dot in a character class. Inside [] the . has no special meaning and matches only a literal dot, so escaping it there has no effect.
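
The point about the dot can be checked with a short sketch. The sample strings here are made up for illustration; the first two patterns differ only in whether the dot inside the character class is escaped.

```python
import re

sample = "Test0.2.1.zip"  # illustrative input, taken from the question's filename

# Inside a character class, "\." and "." both mean a literal dot.
escaped = re.findall(r"Test([\.0-9]*)", sample)
unescaped = re.findall(r"Test([.0-9]*)", sample)
print(escaped == unescaped)  # True
print(unescaped)             # ['0.2.1.']

# Outside a character class the dot is a wildcard, so the two forms differ.
wildcard = re.findall(r"a.c", "abc a.c")
literal = re.findall(r"a[.]c", "abc a.c")
print(wildcard)  # ['abc', 'a.c']
print(literal)   # ['a.c']
```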

Tags:

python

regex

findall

Ashy
