I have been trying to teach myself Python and am currently on regular expressions. The instructional text I have been using seems to be aimed at teaching Perl or some other language that is not Python, so I have had to adapt the expressions a bit to fit Python. I'm not very experienced, however, and I've hit a snag trying to get an expression to work.
The problem involves searching a text for instances of prices, expressed either without decimals, $500, or with decimals, $500.10.
This is what the text recommends:
\$[0-9]+(\.[0-9][0-9])?
Replicating the text, I use this code:
import re
inputstring = "$500.01"
result = re.findall( r'\$[0-9]+(\.[0-9][0-9])?', inputstring)
if result:
    print(result)
else:
    print("No match.")
However, the result is not $500.01, but rather:
.01
I find this strange. If I remove the parentheses and the ? that makes the decimal portion optional, it works fine. So, using this:
\$[0-9]+\.[0-9][0-9]
I get:
$500.01
How can I get the regular expression to return values with and without decimal portions?
Thanks.
Use a non-capturing group:
result = re.findall( r'\$[0-9]+(?:\.[0-9][0-9])?', inputstring)
                                ^^
The re.findall function returns only the text captured by the groups if the pattern defines any capturing groups, and yours has one. You need to get rid of it by turning it into a non-capturing group, (?:...). From the documentation:
re.findall(pattern, string, flags=0)
If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group.
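To make the difference concrete, here is a small sketch that runs the question's pattern both ways on the same input. It also shows one alternative that I am adding here (it is not part of the original recommendation): keep the capturing group and read the whole match with group(0) from re.finditer. The variable names are just for illustration.

```python
import re

text = "$500.01"

# Capturing group: findall returns only what the group captured.
captured = re.findall(r'\$[0-9]+(\.[0-9][0-9])?', text)

# Non-capturing group (?:...): findall returns the whole match.
whole = re.findall(r'\$[0-9]+(?:\.[0-9][0-9])?', text)

# Alternative: keep the capturing group, but take group(0),
# which is always the full match, from each match object.
via_finditer = [m.group(0)
                for m in re.finditer(r'\$[0-9]+(\.[0-9][0-9])?', text)]

print(captured)      # ['.01']
print(whole)         # ['$500.01']
print(via_finditer)  # ['$500.01']
```

The finditer route is mostly useful when you want both the whole price and the decimal part; if you only need the whole price, the non-capturing group is simpler.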
Update
You can shorten your regex a bit by using a limiting quantifier, {2}, that requires exactly 2 occurrences of the preceding subpattern:
r'\$[0-9]+(?:\.[0-9]{2})?'
                    ^^^
Or even replace [0-9] with \d:
r'\$\d+(?:\.\d{2})?'
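As a quick check, the sketch below runs all three spellings of the pattern over a sample sentence containing a price with and without decimals. The sample text and variable names are made up for illustration.

```python
import re

# Hypothetical sample text with one whole-dollar price and one with cents.
sample = "Tickets cost $500 and upgrades cost $500.10 each."

patterns = [
    r'\$[0-9]+(?:\.[0-9][0-9])?',  # original, with non-capturing group
    r'\$[0-9]+(?:\.[0-9]{2})?',    # {2} instead of repeating [0-9]
    r'\$\d+(?:\.\d{2})?',          # \d instead of [0-9]
]

# Run every pattern against the same text.
results = [re.findall(p, sample) for p in patterns]

for pattern, found in zip(patterns, results):
    print(pattern, '->', found)
# Each line ends with: ['$500', '$500.10']
```

One caveat: for str patterns in Python 3, \d also matches decimal digits from other scripts unless you pass re.ASCII, so [0-9] is the stricter choice if that matters for your data.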