I need a regex to get numeric values that can be
111.111,11
111,111.11
111,111
And separate the integer and decimal portions so I can store in a DB with the correct syntax
I tried ([0-9]{1,3}[,.]?)+([,.][0-9]{2})?
With no success since it doesn't detect the second part :(
The result should look like:
111.111,11 -> $1 = 111111; $2 = 11
[0-9]+|[0-9]+). This regular expression matches an optional sign, that is either followed by zero or more digits followed by a dot and one or more digits (a floating point number with optional integer part), or that is followed by one or more digits (an integer).
The regex [0-9] matches single-digit numbers 0 to 9. [1-9][0-9] matches double-digit numbers 10 to 99. That's the easy part. Matching the three-digit numbers is a little more complicated, since we need to exclude numbers 256 through 999.
To match any number from 0 to 9 we use \d in regex. It will match any single digit number from 0 to 9. \d means [0-9] or match any number from 0 to 9. Instead of writing 0123456789 the shorthand version is [0-9] where [] is used for character range.
identifier = letter (letter | digit)* real-numeral = digit digit* .
First Answer:
This matches #,###,##0.00
:
^[+-]?[0-9]{1,3}(?:\,?[0-9]{3})*(?:\.[0-9]{2})?$
And this matches #.###.##0,00
:
^[+-]?[0-9]{1,3}(?:\.?[0-9]{3})*(?:\,[0-9]{2})?$
Joining the two (there are smarter/shorter ways to write it, but it works):
(?:^[+-]?[0-9]{1,3}(?:\,?[0-9]{3})*(?:\.[0-9]{2})?$)
|(?:^[+-]?[0-9]{1,3}(?:\.?[0-9]{3})*(?:\,[0-9]{2})?$)
You can also, add a capturing group to the last comma (or dot) to check which one was used.
Second Answer:
As pointed by Alan M, my previous solution could fail to reject a value like 11,111111.00
where a comma is missing, but the other isn't. After some tests I reached the following regex that avoids this problem:
^[+-]?[0-9]{1,3}
(?:(?<comma>\,?)[0-9]{3})?
(?:\k<comma>[0-9]{3})*
(?:\.[0-9]{2})?$
This deserves some explanation:
^[+-]?[0-9]{1,3}
matches the first (1 to 3) digits;
(?:(?<comma>\,?)[0-9]{3})?
matches on optional comma followed by more 3 digits, and captures the comma (or the inexistence of one) in a group called 'comma';
(?:\k<comma>[0-9]{3})*
matches zero-to-any repetitions of the comma used before (if any) followed by 3 digits;
(?:\.[0-9]{2})?$
matches optional "cents" at the end of the string.
Of course, that will only cover #,###,##0.00
(not #.###.##0,00
), but you can always join the regexes like I did above.
Final Answer:
Now, a complete solution. Indentations and line breaks are there for readability only.
^[+-]?[0-9]{1,3}
(?:
(?:\,[0-9]{3})*
(?:.[0-9]{2})?
|
(?:\.[0-9]{3})*
(?:\,[0-9]{2})?
|
[0-9]*
(?:[\.\,][0-9]{2})?
)$
And this variation captures the separators used:
^[+-]?[0-9]{1,3}
(?:
(?:(?<thousand>\,)[0-9]{3})*
(?:(?<decimal>\.)[0-9]{2})?
|
(?:(?<thousand>\.)[0-9]{3})*
(?:(?<decimal>\,)[0-9]{2})?
|
[0-9]*
(?:(?<decimal>[\.\,])[0-9]{2})?
)$
edit 1: "cents" are now optional; edit 2: text added; edit 3: second solution added; edit 4: complete solution added; edit 5: headings added; edit 6: capturing added; edit 7: last answer broke in two versions;
I would at first use this regex to determine wether a comma or a dot is used as a comma delimiter (It fetches the last of the two):
[0-9,\.]*([,\.])[0-9]*
I would then strip all of the other sign (which the previous didn't match). If there were no matches, you already have an integer and can skip the next steps. The removal of the chosen sign can easily be done with a regex, but there are also many other functions which can do this faster/better.
You are then left with a number in the form of an integer possible followed by a comma or a dot and then the decimals, where the integer- and decimal-part easily can be separated from eachother with the following regex.
([0-9]+)[,\.]?([0-9]*)
Good luck!
Edit:
Here is an example made in python, I assume the code should be self-explaining, if it is not, just ask.
import re
input = str(raw_input())
delimiterRegex = re.compile('[0-9,\.]*([,\.])[0-9]*')
splitRegex = re.compile('([0-9]+)[,\.]?([0-9]*)')
delimiter = re.findall(delimiterRegex, input)
if (delimiter[0] == ','):
input = re.sub('[\.]*','', input)
elif (delimiter[0] == '.'):
input = re.sub('[,]*','', input)
print input
With this code, the following inputs gives this:
111.111,11
111111,11
111,111.11
111111.11
111,111
111,111
After this step, one can now easily modify the string to match your needs.
How about
/(\d{1,3}(?:,\d{3})*)(\.\d{2})?/
if you care about validating that the commas separate every 3 digits exactly, or
/(\d[\d,]*)(\.\d{2})?/
if you don't.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With