Suppose I have a sentence that contains both an age and a time:
import re
text = "I am 21 and work at 3:30"
answer = re.findall(r'\b\d{2}\b', text)
print(answer)
The problem is that it returns not only 21 but also 30, since the pattern looks for any two digits. How do I avoid this, so that it matches only standalone numbers and not digits that sit next to a non-alphanumeric character such as the colon? I tried [0-99] instead of the {2} quantifier, but that didn't help.
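Using \s\d{2}\s matches only two-digit numbers with whitespace on both sides (before and after). Note that it misses a number at the very start or end of the string, and that the match includes the surrounding whitespace unless you put the digits in a capturing group: \s(\d{2})\s
Or, if you don't want to require trailing whitespace: \s\d{2}. Be aware that this also matches the first two digits of a longer number.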
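A quick check of the whitespace-based patterns from this answer, as a sketch. The variable names and the second sample sentence are only illustrative assumptions, not part of the original question.

```python
import re

text = "I am 21 and work at 3:30"

# Without a group, findall returns the whole match, whitespace included.
with_spaces = re.findall(r'\s\d{2}\s', text)
print(with_spaces)   # [' 21 ']

# With a capturing group, findall returns only the digits.
digits_only = re.findall(r'\s(\d{2})\s', text)
print(digits_only)   # ['21']

# Requiring whitespace on both sides misses a number at the end of the string.
at_end = re.findall(r'\s(\d{2})\s', "I am 21")
print(at_end)        # []

# Dropping the trailing \s also picks up the start of a longer number.
# (Hypothetical sentence, used only to show the effect.)
prefix_hits = re.findall(r'\s(\d{2})', "I am 21 and owe 2100")
print(prefix_hits)   # ['21', '21']
```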
That's because \b matches the empty string at a word boundary, and in regex terms a "word" for \b is a run of \w characters (\w+). Since : is not a word character, there is a boundary between : and 30, so 30 is matched as a standalone two-digit word.
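To see where the boundaries fall, it can help to list the \w+ runs in the sentence. This is a small illustration using the text from the question; the variable names are arbitrary.

```python
import re

text = "I am 21 and work at 3:30"

# Every run of word characters (\w+) has a \b on each side.
words = re.findall(r'\w+', text)
print(words)     # ['I', 'am', '21', 'and', 'work', 'at', '3', '30']

# ':' is not matched by \w, so '3' and '30' are separate "words",
# and '30' satisfies \b\d{2}\b just like '21' does.
original = re.findall(r'\b\d{2}\b', text)
print(original)  # ['21', '30']
```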
Instead, you can require whitespace or the start/end of the input around the digits:
(?:^|\s)(\d{2})(?:\s|$)
Example:
In [85]: text = "I am 21 and work at 3:30"
...: re.findall(r'(?:^|\s)(\d{2})(?:\s|$)', text)
Out[85]: ['21']
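One caveat with the pattern above is that it consumes the surrounding whitespace, so two numbers separated by a single space cannot both match. A possible alternative, not part of the answer above, is to use lookarounds that assert "no non-whitespace character here" without consuming anything. The second sample string below is a made-up example to show the difference.

```python
import re

text = "I am 21 and work at 3:30"

# (?<!\S) : not preceded by a non-whitespace character (or at the start)
# (?!\S)  : not followed by a non-whitespace character (or at the end)
lookaround = r'(?<!\S)\d{2}(?!\S)'
consuming = r'(?:^|\s)(\d{2})(?:\s|$)'

result = re.findall(lookaround, text)
print(result)           # ['21']

# Hypothetical input with two adjacent numbers.
adjacent = "21 22 at 3:30"

# The consuming pattern uses up the space after 21, so 22 is skipped.
consumed_hits = re.findall(consuming, adjacent)
print(consumed_hits)    # ['21']

# The lookaround pattern leaves the space in place, so both are found.
lookaround_hits = re.findall(lookaround, adjacent)
print(lookaround_hits)  # ['21', '22']
```

Because the lookaround version has no capturing group, findall returns the matched digits directly.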