I just started looking into regular expressions, and was wondering what the difference is between the following:
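import re

def test():
    string = "He was 75 in the 1985sdfdhs 45"
    # Character set containing only the digits 0-9
    y = re.findall('[0-9]+', string)
    print(y)

test()

and this:

def test2():
    string = "He was 75 in the 1985sdfdhs 45"
    # Same character set, with a dot added inside the brackets
    y = re.findall('[0-9.]+', string)
    print(y)

test2()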
To my understanding, the "." matches any character, so I would have expected the output of test2 to be ['75', '1985sdfdhs', '45']. Instead, both functions print ['75', '1985', '45']. Just trying to figure out what's going on here. Thanks.
Inside the brackets [ and ], the dot is treated as a literal character, not as the "any character" wildcard. So the second regex matches the digits 0-9 as well as a literal period. The brackets denote a character set, which matches any one of the characters listed in it (which is why the . is treated as a plain character and not a specifier). Your string contains no periods, so both patterns return the same result.
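To see the difference, here is a minimal sketch. The extra ", and pi is 3.14" at the end of the string and the variable names are my own additions for illustration, so that the input actually contains a period; they are not part of the original question.

```python
import re

# Assumption: the original sentence, extended with a number that contains a period
text = "He was 75 in the 1985sdfdhs 45, and pi is 3.14"

# Inside [...] the dot is a literal period: runs of digits and periods only
in_set = re.findall(r'[0-9.]+', text)
print(in_set)        # ['75', '1985', '45', '3.14']

# Without the dot in the set, "3.14" is split at the period
digits_only = re.findall(r'[0-9]+', text)
print(digits_only)   # ['75', '1985', '45', '3', '14']

# Outside the brackets the dot is a wildcard (any character except a newline)
wildcard = re.findall(r'[0-9]+.+', "1985sdfdhs")
print(wildcard)      # ['1985sdfdhs']
```

The last pattern is probably closest to what you were expecting: the dot only acts as "any character" when it sits outside a character set.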
EDIT: As an additional note, while you're learning about RegEx, I recommend https://regex101.com/ which will break down each part of the RegEx for you.