Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Safe integer parsing in Ruby

Ruby has this functionality built in:

Integer('1001')                                    # => 1001  
Integer('1001 nights')  
# ArgumentError: invalid value for Integer: "1001 nights"  

As noted in answer by Joseph Pecoraro, you might want to watch for strings that are valid non-decimal numbers, such as those starting with 0x for hex and 0b for binary, and potentially more tricky numbers starting with zero that will be parsed as octal.

Ruby 1.9.2 added optional second argument for radix so above issue can be avoided:

Integer('23')                                     # => 23
Integer('0x23')                                   # => 35
Integer('023')                                    # => 19
Integer('0x23', 10)
# => #<ArgumentError: invalid value for Integer: "0x23">
Integer('023', 10)                                # => 23

This might work:

i.to_i if i.match(/^\d+$/)

Also be aware of the affects that the current accepted solution may have on parsing hex, octal, and binary numbers:

>> Integer('0x15')
# => 21  
>> Integer('0b10')
# => 2  
>> Integer('077')
# => 63

In Ruby numbers that start with 0x or 0X are hex, 0b or 0B are binary, and just 0 are octal. If this is not the desired behavior you may want to combine that with some of the other solutions that check if the string matches a pattern first. Like the /\d+/ regular expressions, etc.


Another unexpected behavior with the accepted solution (with 1.8, 1.9 is ok):

>> Integer(:foobar)
=> 26017
>> Integer(:yikes)
=> 26025

so if you're not sure what is being passed in, make sure you add a .to_s.


I like Myron's answer but it suffers from the Ruby disease of "I no longer use Java/C# so I'm never going to use inheritance again". Opening any class can be fraught with danger and should be used sparingly, especially when it's part of Ruby's core library. I'm not saying don't ever use it, but it's usually easy to avoid and that there are better options available, e.g.

class IntegerInString < String

  def initialize( s )
    fail ArgumentError, "The string '#{s}' is not an integer in a string, it's just a string." unless s =~ /^\-?[0-9]+$/
    super
  end
end

Then when you wish to use a string that could be a number it's clear what you're doing and you don't clobber any core class, e.g.

n = IntegerInString.new "2"
n.to_i
# => 2

IntegerInString.new "blob"
ArgumentError: The string 'blob' is not an integer in a string, it's just a string.

You can add all sorts of other checks in the initialize, like checking for binary numbers etc. The main thing though, is that Ruby is for people and being for people means clarity. Naming an object via its variable name and its class name makes things much clearer.


I had to deal with this in my last project, and my implementation was similar, but a bit different:

class NotAnIntError < StandardError 
end

class String
  def is_int?    
    self =~ /^-?[0-9]+$/
  end

  def safe_to_i
    return self.to_i if is_int?
    raise NotAnIntError, "The string '#{self}' is not a valid integer.", caller
  end
end

class Integer
  def safe_to_i
    return self
  end            
end

class StringExtensions < Test::Unit::TestCase

  def test_is_int
    assert "98234".is_int?
    assert "-2342".is_int?
    assert "02342".is_int?
    assert !"+342".is_int?
    assert !"3-42".is_int?
    assert !"342.234".is_int?
    assert !"a342".is_int?
    assert !"342a".is_int?
  end

  def test_safe_to_i
    assert 234234 == 234234.safe_to_i
    assert 237 == "237".safe_to_i
    begin
      "a word".safe_to_i
      fail 'safe_to_i did not raise the expected error.'
    rescue NotAnIntError 
      # this is what we expect..
    end
  end

end

someString = "asdfasd123"
number = someString.to_i
if someString != number.to_s
  puts "oops, this isn't a number"
end

Probably not the cleanest way to do it, but should work.