
Python regex for number with or without decimals using a dot or comma as separator?




I'm just learning regex and now I'm trying to match a number which more or less represents this:

[zero or more numbers][possibly a dot or comma][zero or more numbers]

No dot or comma is also okay. So it should match the following:

123,  # From here it's the same but with commas instead of dot separators

But it should not match the following:

100,000.99  # I know this and the one below are valid in many languages, but I simply want to reject these

So far I've come up with [0-9]*[.,][0-9]*, but it doesn't seem to work so well:

>>> import re
>>> r = re.compile("[0-9]*[.,][0-9]*")
>>> if r.match('0.1.'): print('it matches!')
it matches!
>>> if r.match('0.abc'): print('it matches!')
it matches!

I have the feeling I'm doing two things wrong: I don't use match correctly AND my regex is not correct. Could anybody enlighten me on what I'm doing wrong? All tips are welcome!

kramer65 asked Oct 01 '14 09:10


2 Answers

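You need to make the [.,] part optional by adding ? after that character class, and you also need to add anchors: ^ asserts that we are at the start of the string and $ asserts that we are at the end. Without anchors, match only requires the pattern to match at the beginning of the string, so any trailing text is ignored.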



>>> import re
>>> r = re.compile(r"^\d*[.,]?\d*$")
>>> if r.match('0.1.'): print('it matches!')
>>> if r.match('0.abc'): print('it matches!')
>>> if r.match('0.'): print('it matches!')
it matches!
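
To see why the original pattern accepted '0.1.', it can help to print what was actually matched. The following is a minimal sketch, assuming Python 3.4 or later for fullmatch; the variable names are only illustrative. It also shows one caveat of $: it matches just before a trailing newline as well.

```python
import re

# The question's original, unanchored pattern.
unanchored = re.compile(r"[0-9]*[.,][0-9]*")

# re.match only anchors at the start, so a matching prefix is enough.
prefix = unanchored.match("0.1.")
print(prefix.group())  # the part of the string that actually matched

# The anchored pattern from this answer rejects the trailing dot.
anchored = re.compile(r"^\d*[.,]?\d*$")
print(bool(anchored.match("0.1.")))
print(bool(anchored.match("0.")))

# `$` also matches just before a newline at the very end of the string ...
print(bool(anchored.match("0.1\n")))

# ... whereas fullmatch requires the entire string to match,
# so no anchors are needed and the trailing newline is rejected.
whole = re.compile(r"\d*[.,]?\d*")
print(bool(whole.fullmatch("0.1\n")))
print(bool(whole.fullmatch("0.1")))
```

If input may end with a newline (for example, lines read from a file), fullmatch or stripping the input first is probably the safer choice.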

If you don't want to allow a lone comma or dot (a string with no digits at all), then use a lookahead.
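
One way such a lookahead could look is sketched below. The pattern and the is_number helper are assumptions for illustration, not necessarily what the answer's author had in mind: (?=.*\d) requires at least one digit somewhere in the string before the rest of the pattern is tried.

```python
import re

# Hypothetical pattern: the lookahead (?=.*\d) demands at least one digit
# ahead of the current position before the main pattern is attempted.
number = re.compile(r"^(?=.*\d)\d*[.,]?\d*$")


def is_number(text):
    """Return True if text contains digits and at most one dot or comma."""
    return number.match(text) is not None


# Strings with digits still pass; ".", "," and "" are now rejected.
for s in ["123", "123.", ".456", ",456", ".", ",", "", "0.1."]:
    print(repr(s), is_number(s))
```

Note that the empty string is rejected too, which the pattern without the lookahead would accept.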



Avinash Raj answered Sep 22 '22 17:09

Your regex would work fine if you add ^ at the front and $ at the back, so that the pattern has to cover the whole string from start to end, and make the separator optional with {0,1}.

Try this:


import re

checklist = ['1', '123', '123.', '123.4', '123.456', '.456', '123,', '123,4', '123,456', ',456', '0.,1', '0a,1', '0..1', '1.1.2', '100,000.99', '100.000,99', '0.1.', '0.abc']

pat = re.compile(r'^[0-9]*[.,]{0,1}[0-9]*$')

for c in checklist:
    if pat.match(c):
        print('%s : it matches' % c)
    else:
        print('%s : it does not match' % c)

1 : it matches
123 : it matches
123. : it matches
123.4 : it matches
123.456 : it matches
.456 : it matches
123, : it matches
123,4 : it matches
123,456 : it matches
,456 : it matches
0.,1 : it does not match
0a,1 : it does not match
0..1 : it does not match
1.1.2 : it does not match
100,000.99 : it does not match
100.000,99 : it does not match
0.1. : it does not match
0.abc : it does not match
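
The two answers spell the same pattern differently: {0,1} is equivalent to ?, and [0-9] behaves like \d for ASCII input. The sketch below assumes Python 3, where \d on str patterns also matches non-ASCII Unicode digits unless re.ASCII is passed. The sample strings and variable names are illustrative only.

```python
import re

with_class = re.compile(r"^[0-9]*[.,]{0,1}[0-9]*$")  # spelling used with the checklist
with_d = re.compile(r"^\d*[.,]?\d*$")                # spelling from the other answer
ascii_d = re.compile(r"^\d*[.,]?\d*$", re.ASCII)     # \d restricted to 0-9

# The last sample consists of Arabic-Indic digits (U+0661..U+0663).
samples = ["123,4", "0..1", "100,000.99", "\u0661\u0662\u0663"]

for s in samples:
    print(ascii(s),
          bool(with_class.match(s)),
          bool(with_d.match(s)),
          bool(ascii_d.match(s)))
```

For plain ASCII input all three agree; they differ only on the non-ASCII digits, which [0-9] and the re.ASCII variant reject.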
thisisshantzz answered Sep 20 '22 17:09
