
Ruby (on Rails) Regex: removing thousands comma from numbers

This seems like a simple one, but I am missing something.

I have a number of inputs coming in from a variety of sources and in different formats.

Number inputs

123
123.45
123,45 (note the comma used here to denote decimals)
1,234
1,234.56
12,345.67
12,345,67 (note the comma used here to denote decimals)

Additional info on the inputs

  • Numbers will always be less than 1 million
  • EDIT: These are prices, so they will either be whole numbers or go to the hundredths place

I am trying to write a regex and use gsub to strip out the thousands comma. How do I do this?

I wrote a regex: myregex = /\d+(,)\d{3}/

When I test it in Rubular, it shows that it captures the comma only in the test cases that I want.

But when I run gsub, I get an empty string: inputstr.gsub(myregex,"")

It looks like gsub is capturing everything, not just the comma in (). Where am I going wrong?

bigwinner asked Jan 30 '13 21:01


2 Answers

result = inputstr.gsub(/,(?=\d{3}\b)/, '')

This removes a comma only if exactly three digits follow it.

(?=...) is a lookahead assertion: the pattern inside it must match at the current position, but the text it matches does not become part of the overall match (and so is not replaced).
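
To check the lookahead against the inputs listed in the question, here is a minimal sketch. The helper name strip_thousands_comma and the samples table are only illustrative (they are not from the original post), and the expected values are my reading of what the asker wants: thousands commas removed, decimal commas left alone.

```ruby
# Hypothetical helper wrapping the gsub call from this answer.
def strip_thousands_comma(input)
  # Match a comma only when exactly three digits and then a word boundary follow.
  # The lookahead checks those digits without consuming them, so only the
  # comma itself is replaced.
  input.gsub(/,(?=\d{3}\b)/, '')
end

# Sample inputs from the question, mapped to the assumed desired output.
samples = {
  '123'       => '123',
  '123.45'    => '123.45',
  '123,45'    => '123,45',    # decimal comma: only two digits follow, so it stays
  '1,234'     => '1234',
  '1,234.56'  => '1234.56',
  '12,345.67' => '12345.67',
  '12,345,67' => '12345,67'   # first comma removed, decimal comma kept
}

samples.each do |input, expected|
  actual = strip_thousands_comma(input)
  status = actual == expected ? 'ok' : 'MISMATCH'
  puts "#{input.inspect} -> #{actual.inspect} (#{status})"
end
```

The \b after \d{3} is what keeps a comma followed by four or more digits from matching, since there is no word boundary between two digits.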

Tim Pietzcker answered Sep 23 '22 05:09


You are confusing "match" with "capture": to "capture" means to save something so you can refer to it later. You want to capture not the comma, but everything else, and then use the captured portions to build your substitution string.

Try:

myregex = /(\d+),(\d{3})/

inputstr.gsub(myregex,'\1\2')
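
To make the match/capture difference concrete, here is a small sketch comparing the regex from the question with the one above. The constant names QUESTION_REGEX and ANSWER_REGEX and the sample string are just for illustration.

```ruby
QUESTION_REGEX = /\d+(,)\d{3}/     # pattern from the question: captures only the comma
ANSWER_REGEX   = /(\d+),(\d{3})/   # pattern from this answer: captures the digits

input = '1,234.56'

m = input.match(QUESTION_REGEX)
p m[0]   # => "1,234"  (the whole match, which is what gsub replaces)
p m[1]   # => ","      (capture group 1, which gsub ignores unless you refer to it)

# Replacing the whole match with "" throws away the digits as well.
p input.gsub(QUESTION_REGEX, '')      # => ".56"
p '1,234'.gsub(QUESTION_REGEX, '')    # => ""

# Capturing the digits lets the replacement string put them back.
p input.gsub(ANSWER_REGEX, '\1\2')    # => "1234.56"

# The block form does the same thing using the $1 and $2 globals.
p input.gsub(ANSWER_REGEX) { "#{$1}#{$2}" }   # => "1234.56"
```

Note that this pattern does not check what comes after the three digits, so an input like 1,2345 would also lose its comma. None of the inputs listed in the question hit that case.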

theglauber answered Sep 21 '22 05:09