Python regex returns a part of the match when used with re.findall

I have been trying to teach myself Python and am currently on regular expressions. The instructional text I have been using seems to be aimed at teaching Perl or some other language that is not Python, so I have had to adapt the expressions a bit to fit Python. I'm not very experienced, however, and I've hit a snag trying to get an expression to work.

The problem involves searching a text for prices written either without decimals (e.g. $500) or with decimals (e.g. $500.10).

This is what the text recommends:

\$[0-9]+(\.[0-9][0-9])?

Replicating the text, I use this code:

import re

inputstring = "$500.01"

result = re.findall( r'\$[0-9]+(\.[0-9][0-9])?', inputstring)

if result:
    print(result)
else:
    print("No match.")

However, the result is not ['$500.01'], but rather:

['.01']

I find this strange. If I remove the parentheses and the ? that makes the decimal portion optional, it works fine. So, using this:

\$[0-9]+\.[0-9][0-9]

I get:

['$500.01']

How can I get the regular expression to return values with and without decimal portions?

Thanks.

Jordan H. asked Sep 26 '22 23:09


1 Answer

Use a non-capturing group:

result = re.findall( r'\$[0-9]+(?:\.[0-9][0-9])?', inputstring)
                                ^^ 

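When a pattern contains capturing groups, re.findall returns the text captured by those groups instead of the whole match, and your pattern contains one: (\.[0-9][0-9]). Turn it into a non-capturing group, (?:...), so that re.findall returns the whole match again. This is how the documentation describes it: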

re.findall(pattern, string, flags=0)
If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group.
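
To see the difference concretely, here is a minimal sketch that runs both versions of the pattern over the same input. The sample text and the variable names are made up for illustration; only the two patterns come from the question and the answer above.

```python
import re

# Hypothetical sample text, for illustration only.
text = "Widgets cost $500.01, gadgets cost $20, and gizmos cost $7.50."

# Capturing group: findall returns only what the group captured.
# When the optional group does not take part in a match, the result is ''.
capturing = re.findall(r'\$[0-9]+(\.[0-9][0-9])?', text)

# Non-capturing group: the pattern has no groups, so findall returns whole matches.
non_capturing = re.findall(r'\$[0-9]+(?:\.[0-9][0-9])?', text)

print(capturing)      # ['.01', '', '.50']
print(non_capturing)  # ['$500.01', '$20', '$7.50']
```

Note the empty string for $20 in the first list: the price is matched, but the group captured nothing.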

Update

You can shorten your regex a bit by using a limiting quantifier {2} that requires exactly 2 occurrences of the preceding subpattern:

r'\$[0-9]+(?:\.[0-9]{2})?'
                    ^^^

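Or even replace [0-9] with \d (note that for str patterns in Python 3, \d also matches non-ASCII Unicode decimal digits unless the re.ASCII flag is set):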

r'\$\d+(?:\.\d{2})?'
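
As a quick check, the sketch below runs the three spellings of the pattern against the same made-up input; for plain ASCII text like this they should give identical results. It also shows one possible alternative, not part of the original answer, for cases where you want to keep the capturing group: re.finditer yields match objects, and group(0) is always the whole match. The sample text and variable names are assumptions for illustration.

```python
import re

# Hypothetical sample text, for illustration only.
text = "Prices: $500.01, $20, $7.50"

# The three equivalent spellings of the non-capturing pattern.
patterns = [
    r'\$[0-9]+(?:\.[0-9][0-9])?',
    r'\$[0-9]+(?:\.[0-9]{2})?',
    r'\$\d+(?:\.\d{2})?',
]
results = [re.findall(p, text) for p in patterns]
for pattern, found in zip(patterns, results):
    print(pattern, found)

# Alternative: keep the capturing group and read the whole match from
# each match object via group(0).
whole_matches = [
    m.group(0) for m in re.finditer(r'\$[0-9]+(\.[0-9][0-9])?', text)
]
print(whole_matches)  # ['$500.01', '$20', '$7.50']
```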
Wiktor Stribiżew answered Oct 11 '22 20:10