I was looking for a short piece of code that puts commas into a number, and I found one on this site.
The code:
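function addCommas(nStr)
{
    nStr += '';                                 // coerce the input to a string
    var x = nStr.split('.');                    // separate integer and fractional parts
    var x1 = x[0];                              // integer part
    var x2 = x.length > 1 ? '.' + x[1] : '';    // fractional part, if any
    var rgx = /(\d+)(\d{3})/;
    // insert one comma per pass until no run of more than 3 digits is left
    while (rgx.test(x1)) {
        x1 = x1.replace(rgx, '$1' + ',' + '$2');
    }
    return x1 + x2;
}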
It works really well. Take this example number:
addCommas('83475934.89');
It returns "83,475,934.89", but when I read the code, I expected it to return 8,3,4,7,5,934.89.
The site explains that \d+ in combination with \d{3} will match a group of 3 digits preceded by any number of digits, and that this tricks the search into replacing from right to left.
That is what confuses me.
How does this code read from right to left? Plus, what do $1 and $2 mean?
It isn't actually reading right-to-left. What's really happening is that it's repeatedly applying the (\d+)(\d{3}) pattern (via a while loop) and replacing until it no longer matches the pattern. In other words:
(x1 holds only the integer part; the '.89' sits in x2 and is appended at the end.)
Iteration 1:
x1 is '83475934'
x1 = x1.replace(/(\d+)(\d{3})/, '$1' + ',' + '$2');
x1 is now '83475,934'
Iteration 2:
x1 is '83475,934'
x1 = x1.replace(/(\d+)(\d{3})/, '$1' + ',' + '$2');
x1 is now '83,475,934'
Iteration 3:
x1 is '83,475,934'
rgx.test(x1) is false
// no match; end loop
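The iterations above can be reproduced with a small sketch. The helper name traceCommas is made up for this illustration; it runs the same loop as addCommas on the integer part only and records every intermediate value.

```javascript
// Hypothetical helper (not part of the original code): same loop as
// addCommas, but it keeps every intermediate string so the passes are visible.
function traceCommas(intPart) {
  var rgx = /(\d+)(\d{3})/;
  var steps = [intPart];
  while (rgx.test(intPart)) {
    intPart = intPart.replace(rgx, '$1,$2');
    steps.push(intPart);
  }
  return steps;
}

console.log(traceCommas('83475934'));
// [ '83475934', '83475,934', '83,475,934' ]
```

Each pass adds exactly one comma, in front of the last run of three digits in the leading block of digits.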
Edit:
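Plus, what does $1 and $2 mean?
Those refer to the text captured by the groups (\d+) and (\d{3}) respectively. In the replacement string, $1 is replaced by whatever the first pair of parentheses matched and $2 by whatever the second pair matched.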
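A quick way to see what ends up in each group is to look at the match result directly. This is only an illustrative sketch; the variable names are arbitrary.

```javascript
// match() returns the whole match at index 0 and the captured groups after it.
var groups = '83475934'.match(/(\d+)(\d{3})/);
console.log(groups[1]); // "83475"  -> what $1 refers to
console.log(groups[2]); // "934"    -> what $2 refers to

// In a replacement string the groups can be reused in any order.
var swapped = '83475934'.replace(/(\d+)(\d{3})/, '$2-$1');
console.log(swapped); // "934-83475"
```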
Here's a great reference for learning how Regular Expressions actually work:
http://www.regular-expressions.info/quickstart.html
It appears to work from right to left because it uses greedy pattern matching. The \d+ first consumes as many digits as it can, and the engine then backtracks just far enough for \d{3} to match. In the number 2421567.56, for example, \d+ would first match all the digits up to the '.' (2421567), then give back the last three so that \d{3} can match 567. The loop then adds a comma between $1 and $2.
The $'s represent the groups formed in the regex with parentheses, e.g. (\d+) = $1 and (\d{3}) = $2. In this way it can easily add characters between them.
In the next iteration, the greedy matching stops at the newly created comma instead, and the loop continues until no run of more than 3 digits is left.
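To make the role of greediness concrete, here is a sketch that compares the original greedy \d+ with a lazy \d+? (the lazy variant is not in the original code and is only added for contrast). The helper name commaLoop is hypothetical.

```javascript
// Greedy: \d+ takes as much as possible, then backtracks 3 digits.
var greedy = '2421567'.match(/(\d+)(\d{3})/);
console.log(greedy[1], greedy[2]); // "2421" "567"

// Lazy: \d+? takes as little as possible, so the split happens at the left.
var lazy = '2421567'.match(/(\d+?)(\d{3})/);
console.log(lazy[1], lazy[2]); // "2" "421"

// Hypothetical helper: the same while loop, with the regex passed in.
function commaLoop(intPart, rgx) {
  while (rgx.test(intPart)) {
    intPart = intPart.replace(rgx, '$1,$2');
  }
  return intPart;
}

console.log(commaLoop('83475934', /(\d+)(\d{3})/));  // "83,475,934"
console.log(commaLoop('83475934', /(\d+?)(\d{3})/)); // "8,3,4,7,5,934"
```

The lazy version produces the 8,3,4,7,5,934 result the question expected, which suggests the surprise comes from reading \d+ as if it matched as few digits as possible.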
I wrote a regular expression which does the same thing in a single pass:
/(?!\b)(\d{3}(?=(\d{3})*\b))/g
Try this for example with varying numbers at the start:
var num = '1234567890123456';
for(var i = 1; i <= num.length; i++)
{
console.log(num.slice(0, -i).replace(/(?!\b)(\d{3}(?=(\d{3})*\b))/g, ',$1'));
}
I'll try to break it down here:
Ignore the (?!\b) bit at the start for now - I'll come back to that.
(?!\b)(\d{3}(?=(\d{3})*\b))
It still reads from left to right trying to capture blocks of 3 digits. Here's the capturing group.
(?!\b)(\d{3}(?=(\d{3})*\b))
However, inside the capturing group, it uses a lookahead.
(?!\b)(\d{3}(?=(\d{3})*\b))
The lookahead looks for any multiple of 3 digits anchored to the end of the number - the terminating boundary. This aligns the capture to multiples of 3 from the right-hand end of the number. This means it works with decimal numbers too, unless they have more than 3 decimal places, in which case it will put commas in the fractional part as well. It ain't perfect.
(?!\b)(\d{3}(?=(\d{3})*\b))
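The decimal caveat is easy to check. The sketch below also shows one possible workaround, which is an assumption of mine and not part of the answer: a hypothetical helper groupIntegerPart that applies the regex to the integer part only.

```javascript
var rgx = /(?!\b)(\d{3}(?=(\d{3})*\b))/g;

// Up to 3 decimal places: the fractional part is left alone.
console.log('1234567.89'.replace(rgx, ',$1')); // "1,234,567.89"

// More than 3 decimal places: a comma lands in the fraction too.
console.log('1234.5678'.replace(rgx, ',$1'));  // "1,234.5,678"

// Hypothetical workaround: split on '.', group only the integer part.
function groupIntegerPart(numStr) {
  var parts = String(numStr).split('.');
  parts[0] = parts[0].replace(rgx, ',$1');
  return parts.join('.');
}

console.log(groupIntegerPart('1234.5678')); // "1,234.5678"
```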
The problem I had was that JavaScript didn't support look-behinds (they were only added later, in ES2018), so when the number has a multiple of 3 digits, it was matching the first 3 digits and placing a comma at the very start of the number.
You can't match a character before the 3 digit match without throwing off the repetition, so I had to use a negative lookahead that matches a word-boundary. It's kinda the opposite of putting ^ at the start.
(?!\b)(\d{3}(?=(\d{3})*$))
Essentially it prevents the expression from matching from the start of the string.
Which would be bad.
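The effect of the guard shows up when the digit count is a multiple of 3. In this sketch the variable names are arbitrary, and the \B variant is my own addition: \B matches wherever \b does not, so it should behave the same as (?!\b) here.

```javascript
var withoutGuard = /(\d{3}(?=(\d{3})*\b))/g;
var withGuard    = /(?!\b)(\d{3}(?=(\d{3})*\b))/g;
var withNotB     = /\B(\d{3}(?=(\d{3})*\b))/g; // assumed equivalent spelling

// 6 digits: without the guard, the first block also gets a comma.
console.log('123456'.replace(withoutGuard, ',$1')); // ",123,456"
console.log('123456'.replace(withGuard, ',$1'));    // "123,456"
console.log('123456'.replace(withNotB, ',$1'));     // "123,456"

// 7 digits: the guard makes no difference.
console.log('1234567'.replace(withoutGuard, ',$1')); // "1,234,567"
console.log('1234567'.replace(withGuard, ',$1'));    // "1,234,567"
```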