
Regex a decimal number with comma

Tags: regex

I'm having trouble finding the right regex for decimal numbers that include the comma separator.

I did find a few other questions about this issue in general, but none of the answers really worked when I tested them.

The best I got so far is:

[0-9]{1,3}(,([0-9]{3}))*(.[0-9]+)?

2 main problems so far:

1) It matches numbers with a space between them, such as "3001 1", as a single match instead of splitting them into 2 matches, "3001" and "1". I don't really see where I allowed a space in the regex.

2) I have a general problem with the beginning/end of the regex.

The regex should match:

3,001
1
32,012,111.2131 

But not:

32,012,11.2131
1132,012,111.2131
32,0112,111.2131
32131

In addition I'd like it to match:

1. (without any number after it)
1, (without any number after it)

as 1 (a comma or point at the end of the number should be ignored).

Many thanks!

LiranBo asked Nov 22 '13 20:11





2 Answers

This is a very long and convoluted regular expression, but it fits all your requirements. It will work if your regex engine is based on PCRE (hopefully you're using PHP, Delphi or R).

(?<=[^\d,.]|^)\d{1,3}(,(\d{3}))*((?=[,.](\s|$))|(\.\d+)?(?=[^\d,.]|$))

DEMO on RegExr
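For readers without a PCRE engine, here is a minimal sketch of the same pattern in Python. One adaptation is an assumption of this sketch rather than part of the answer: Python's `re` module only accepts fixed-width lookbehinds, so `(?<=[^\d,.]|^)` is rewritten as the equivalent negative lookbehind `(?<![\d,.])`. The groups are also made non-capturing, and the helper name `find_numbers` is hypothetical.

```python
import re

# Adapted from the pattern above for Python's `re` (assumption: the negative
# lookbehind stands in for `(?<=[^\d,.]|^)`, which `re` rejects as variable-width).
NUMBER = re.compile(
    r"(?<![\d,.])"              # not preceded by a digit, comma or point
    r"\d{1,3}(?:,\d{3})*"       # 1-3 digits, then groups of exactly 3 after commas
    r"(?:"
    r"(?=[,.](?:\s|$))"         # a trailing , or . before whitespace/end: stop here
    r"|"
    r"(?:\.\d+)?(?=[^\d,.]|$)"  # optional decimals, then no digit, comma or point
    r")"
)

def find_numbers(text):
    """Return every well-formed number found in `text` (hypothetical helper)."""
    return [m.group(0) for m in NUMBER.finditer(text)]

print(find_numbers("3,001 1 32,012,111.2131"))  # ['3,001', '1', '32,012,111.2131']
print(find_numbers("1. and 1, and 1,000."))     # ['1', '1', '1,000']
# None of the malformed examples from the question produce a match:
print(find_numbers("32,012,11.2131 1132,012,111.2131 32,0112,111.2131 32131"))  # []
```

`finditer` with `group(0)` is used so that the result is always the full match, regardless of how the groups inside the pattern are written.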

The things that make it so long:

  1. Matching multiple numbers on the same line separated by only 1 character (a space) whilst not allowing partial matches requires a lookahead and a lookbehind.
  2. Matching numbers ending with . or , without including the . or , in the match requires another lookahead.

(?=[,.](\s|$)) Explanation

When writing this explanation I realised the \s needs to be a (\s|$) to match 1, at the very end of a string.

This part of the regex is for matching the 1 in "1," or the 1,000 in "1,000.", so let's say our number is "1,000." (with the . on the end).

Up to this point the regex has matched 1,000. It can't find another , to repeat the thousands group, so it moves on to our (?=[,.](\s|$))

(?=....) means it's a lookahead: from the point we have matched up to, it looks at what's coming next but doesn't add it to the match.

So it checks whether there is a , or a . and, if there is, checks that it's immediately followed by whitespace or the end of input. In this case it is, so it leaves the match as 1,000

Had the lookahead not matched, it would have moved on to trying to match decimal places.
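The difference between looking ahead and consuming can be seen in isolation. The sketch below is in Python and uses only the digit groups plus the lookahead from the pattern above; the `consuming` variant is not from the answer and is included purely for contrast.

```python
import re

# The thousands-grouped digits, followed by the lookahead from the answer.
# The lookahead checks for a trailing , or . but does not consume it.
with_lookahead = re.compile(r"\d{1,3}(?:,\d{3})*(?=[,.](?:\s|$))")
# For contrast (not from the answer): the same check written without a lookahead.
consuming = re.compile(r"\d{1,3}(?:,\d{3})*[,.](?:\s|$)")

text = "1,000."
m = with_lookahead.match(text)
print(m.group(0), m.end())              # 1,000 5  (the final . is left unconsumed)
print(consuming.match(text).group(0))   # 1,000.  (the . becomes part of the match)

# When a digit follows the point, the lookahead fails at every position,
# which is where the full pattern falls through to its decimal branch.
print(with_lookahead.search("1,000.5"))  # None
```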

OGHaza answered Oct 14 '22 10:10



This works for all the examples that you have listed:

^[0-9]{1,3}(,[0-9]{3})*(([\\.,]{1}[0-9]*)|())$
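A quick way to check this pattern against the examples from the question is sketched below in Python. Because of the `^` and `$` anchors, each example is tested as a whole string. The reading of the doubled backslash as string-literal escaping is an assumption.

```python
import re

# The pattern exactly as posted. In a Python raw string, [\\.,] accepts a
# backslash as well as a point or comma; the doubled backslash may be
# string-literal escaping from another language (assumption).
WHOLE = re.compile(r"^[0-9]{1,3}(,[0-9]{3})*(([\\.,]{1}[0-9]*)|())$")

should_match = ["3,001", "1", "32,012,111.2131", "1.", "1,"]
should_not_match = ["32,012,11.2131", "1132,012,111.2131",
                    "32,0112,111.2131", "32131"]

print(all(WHOLE.match(s) for s in should_match))      # True
print(any(WHOLE.match(s) for s in should_not_match))  # False

# Two differences from what the question asks for:
print(WHOLE.match("1.").group(0))  # 1.  (the trailing point is kept in the match)
print(WHOLE.search("3,001 1"))     # None (the anchors reject text with several numbers)
```

So the pattern validates a single number per string, whereas finding several numbers inside a longer text needs the lookaround approach from the other answer.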
Adarsh answered Oct 14 '22 09:10
