Regular expression to strip thousand separator from numeral string?

I have strings that contain thousand separators, but no string-to-number function in JavaScript parses them correctly. I'm thinking about "preparing" the string by stripping all thousand separators, leaving everything else untouched, and letting the Number/parseInt/parseFloat functions (I'm satisfied with their behaviour otherwise) decide the rest. But I have no idea which RegExp can do that!

Better ideas are welcome too!


UPDATE:

Sorry, the answers showed me how badly formulated the question is. What I'm trying to achieve is: 1) to strip thousand separators only, if there are any, but 2) to not disturb the original string much, so that I still get NaNs for invalid numerals.

MORE UPDATE:

JavaScript is limited to the English locale for parsing, so let's assume the thousand separator is ',' for simplicity (naturally, it never matches the decimal separator in any locale, so changing to any other locale should not pose a problem).

Now, on parsing functions:

parseFloat('1023.95BARGAIN BYTES!')  // parseXXX functions just "give up" at the first invalid character and return 1023.95
Number('1023.95BARGAIN BYTES!')      // while the Number constructor behaves "strictly" and returns NaN

Sometimes I use the loose one, sometimes the strict one. I want to figure out the best approach for preparing the string for both functions.

On validity of numerals:

'1,023.99' is a perfectly well-formed English numeral, and stripping all commas gives the correct result. '1,0,2,3.99' is broken, yet generic comma stripping turns it into '1023.99', which is unlikely to be the intended result.
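For reference, here is a small sketch of what the built-in parsers do with an unmodified string that still contains commas. The variable names are only illustrative.

```javascript
// How the built-in parsers treat a string that still contains commas.
const wellFormed = '1,023.99';

// parseFloat reads the longest valid numeric prefix and stops at the first comma.
const looseResult = parseFloat(wellFormed);

// Number requires the whole string to be a valid numeral, so it rejects it.
const strictResult = Number(wellFormed);

console.log(looseResult);   // 1
console.log(strictResult);  // NaN
```

So neither function handles the separator by itself: the loose one silently truncates, and the strict one refuses the whole string.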

OnTheFly asked Nov 18 '11 20:11


2 Answers

welp, I'll venture to throw my suggestion into the pot:

Note: Revised

stringWithNumbers = stringWithNumbers.replace(/(\d+),(?=\d{3}(\D|$))/g, "$1");

should turn

1,234,567.12
1,023.99
1,0,2,3.99
the dang thing costs $1,205!!
95,5,0,432
12345,0000
1,2345

into:

1234567.12
1023.99
1,0,2,3.99
the dang thing costs $1205!!
95,5,0432
12345,0000
1,2345
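A quick way to check the table above is to run the pattern over the same sample strings. This is only a sketch; `stripSeparators` is a hypothetical helper name, not something from the original post.

```javascript
// Pattern from the answer: drop a comma only when it is followed by exactly
// three digits and then a non-digit or the end of the string.
const groupedComma = /(\d+),(?=\d{3}(\D|$))/g;

// Hypothetical helper wrapping the replace call.
function stripSeparators(text) {
  return text.replace(groupedComma, '$1');
}

const samples = [
  '1,234,567.12',
  '1,023.99',
  '1,0,2,3.99',
  'the dang thing costs $1,205!!',
  '95,5,0,432',
  '12345,0000',
  '1,2345',
];

for (const sample of samples) {
  console.log(`${sample} -> ${stripSeparators(sample)}`);
}
```

Because a broken numeral such as '1,0,2,3.99' keeps its commas, passing the result to Number still yields NaN, which seems to be what the question asks for.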

I hope that's useful!

EDIT:

There is an additional alteration that may be necessary, but is not without side effects:

(\b\d{1,3}),(?=\d{3}(\D|$))

This changes the "one or more" quantifier (+) for the first set of digits into a "one to three" quantifier ({1,3}) and adds a "word-boundary" assertion before it. It will prevent replacements like 1234,123 ==> 1234123. However, it will also prevent a replacement that might be desired (if it is preceded by a letter or underscore), such as A123,789 or _1,555 (which will remain unchanged).
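To make the trade-off concrete, the two patterns can be compared side by side on the inputs mentioned above. The constant and function names here are made up for the illustration.

```javascript
// Original pattern: any run of digits may precede the comma.
const loosePattern = /(\d+),(?=\d{3}(\D|$))/g;

// Revised pattern: one to three digits, starting at a word boundary.
const boundedPattern = /(\b\d{1,3}),(?=\d{3}(\D|$))/g;

// Hypothetical helper: apply one of the patterns to a string.
function stripWith(pattern, text) {
  return text.replace(pattern, '$1');
}

const inputs = ['1,234,567.12', '1234,123', 'A123,789', '_1,555'];

for (const input of inputs) {
  const loose = stripWith(loosePattern, input);
  const bounded = stripWith(boundedPattern, input);
  console.log(`${input}: loose -> ${loose}, bounded -> ${bounded}`);
}
```

Both patterns agree on '1,234,567.12'. The other three inputs are rewritten by the original pattern but left untouched by the revised one, since letters, digits, and underscores all count as word characters for `\b`.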

Code Jockey answered Sep 20 '22 00:09


A simple num.replace(/,/g, '') should be sufficient I think.
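As a rough sketch of how this behaves with both parsers (the `stripAllCommas` name is only for illustration):

```javascript
// Blanket approach: remove every comma, wherever it appears.
function stripAllCommas(num) {
  return num.replace(/,/g, '');
}

const strictGood = Number(stripAllCommas('1,023.99'));
const strictBroken = Number(stripAllCommas('1,0,2,3.99'));
const strictJunk = Number(stripAllCommas('1,023.95BARGAIN BYTES!'));
const looseJunk = parseFloat(stripAllCommas('1,023.95BARGAIN BYTES!'));

console.log(strictGood);    // 1023.99
console.log(strictBroken);  // 1023.99 (the malformed grouping is accepted)
console.log(strictJunk);    // NaN
console.log(looseJunk);     // 1023.95
```

It is sufficient when the input is known to be well grouped. Note that it also accepts '1,0,2,3.99', which the question treats as invalid.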

J. K. answered Sep 20 '22 00:09