I am trying to filter a list of strings with regular expressions, as shown in this answer. However, the code gives an unexpected result:
In [123]: r = re.compile('[0-9]*')
In [124]: string_list = ['123', 'a', '467', 'a2_2', '322_2', '21']
In [125]: filter(r.match, string_list)
Out[125]: ['123', 'a', '467', 'a2_2', '322_2', '21']
I expected the output to be ['123', '467', '21'].
The problem is that your pattern uses the * quantifier, which matches zero or more digits. So even if the string doesn't start with a digit at all, it still matches the pattern (with an empty match). Furthermore, match only anchors the pattern at the start of the string, so it accepts digits followed by anything else, meaning 322_2 is still a valid match because it begins with digits.
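To see why every element passes the filter, it can help to print what the original pattern actually matches. This is only a small sketch; the sample strings and the name matched_text are chosen for illustration.

```python
import re

r = re.compile('[0-9]*')

# match() only looks at the start of the string, and * accepts zero digits,
# so every string produces a match object, sometimes for the empty string.
matched_text = {s: r.match(s).group() for s in ['123', 'a', 'a2_2', '322_2']}
print(matched_text)
# {'123': '123', 'a': '', 'a2_2': '', '322_2': '322'}
```

An empty match is still a match object, which is truthy, so filter keeps the string.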
Try using this pattern:
^[0-9]+$
Or more simply:
^\d+$
This will match one or more digits. The start (^) and end ($) anchors ensure that no other characters are allowed within the string.
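Put together, a minimal sketch of the filtering might look like this. It uses a sample list like the one in the question; the names digits_only and digits_only_fullmatch are illustrative. A list comprehension is used so the result is a list on both Python 2 and Python 3 (in Python 3, filter returns an iterator rather than a list).

```python
import re

string_list = ['123', 'a', '467', 'a2_2', '322_2', '21']

# Anchored pattern: the whole string must consist of one or more digits.
r = re.compile(r'^\d+$')
digits_only = [s for s in string_list if r.match(s)]
print(digits_only)  # ['123', '467', '21']

# Alternative (Python 3.4+): fullmatch requires the entire string to match,
# so the pattern itself needs no anchors.
r_full = re.compile(r'\d+')
digits_only_fullmatch = [s for s in string_list if r_full.fullmatch(s)]
print(digits_only_fullmatch)  # ['123', '467', '21']
```

One difference worth knowing: $ also matches just before a trailing newline, so '123\n' would pass the anchored pattern but not fullmatch.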