Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fastest way to get maximum value from an exclusive Range in ruby

Ok, so say you have a really big Range in ruby. I want to find a way to get the max value in the Range.

The Range is exclusive (defined with three dots) meaning that it does not include the end object in it's results. It could be made up of Integer, String, Time, or really any object that responds to #<=> and #succ. (which are the only requirements for the start/end object in Range)

Here's an example of an exclusive range:

  past  = Time.local(2010, 1, 1, 0, 0, 0)
  now   = Time.now
  range = past...now

  range.include?(now)  # => false

Now I know I could just do something like this to get the max value:

  range.max  # => returns 1 second before "now" using Enumerable#max

But this will take a non-trivial amount of time to execute. I also know that I could subtract 1 second from whatever the end object is is. However, the object may be something other than Time, and it may not even support #-. I would prefer to find an efficient general solution, but I am willing to combine special case code with a fallback to a general solution (more on that later).

As mentioned above using Range#last won't work either, because it's an exclusive range and does not include the last value in it's results.

The fastest approach I could think of was this:

  max = nil
  range.each { |value| max = value }

  # max now contains nil if the range is empty, or the max value

This is similar to what Enumerable#max does (which Range inherits), except that it exploits the fact that each value is going to be greater than the previous, so we can skip using #<=> to compare the each value with the previous (the way Range#max does) saving a tiny bit of time.

The other approach I was thinking about was to have special case code for common ruby types like Integer, String, Time, Date, DateTime, and then use the above code as a fallback. It'd be a bit ugly, but probably much more efficient when those object types are encountered because I could use subtraction from Range#last to get the max value without any iterating.

Can anyone think of a more efficient/faster approach than this?

like image 922
dkubb Avatar asked Feb 18 '10 07:02

dkubb


3 Answers

The simplest solution that I can think of, which will work for inclusive as well as exclusive ranges:

range.max

Some other possible solutions:

range.entries.last
range.entries[-1]

These solutions are all O(n), and will be very slow for large ranges. The problem in principle is that range values in Ruby are enumerated using the succ method iteratively on all values, starting at the beginning. The elements do not have to implement a method to return the previous value (i.e. pred).

The fastest method would be to find the predecessor of the last item (an O(1) solution):

range.exclude_end? ? range.last.pred : range.last

This works only for ranges that have elements which implement pred. Later versions of Ruby implement pred for integers. You have to add the method yourself if it does not exist (essentially equivalent to special case code you suggested, but slightly simpler to implement).

Some quick benchmarking shows that this last method is the fastest by many orders of magnitude for large ranges (in this case range = 1...1000000), because it is O(1):

                                          user     system      total        real
r.entries.last                       11.760000   0.880000  12.640000 ( 12.963178)
r.entries[-1]                        11.650000   0.800000  12.450000 ( 12.627440)
last = nil; r.each { |v| last = v }  20.750000   0.020000  20.770000 ( 20.910416)
r.max                                17.590000   0.010000  17.600000 ( 17.633006)
r.exclude_end? ? r.last.pred : r.last 0.000000   0.000000   0.000000 (  0.000062)

Benchmark code is here.

In the comments it is suggested to use range.last - (range.exclude_end? ? 1 : 0). It does work for dates without additional methods, but will never work for non-numeric ranges. String#- does not exist and makes no sense with integer arguments. String#pred, however, can be implented.

like image 144
molf Avatar answered Oct 30 '22 13:10

molf


I'm not sure about the speed (and initial tests don't seem incredibly fast), but the following might do what you need:

past  = Time.local(2010, 1, 1, 0, 0, 0)
now   = Time.now
range = past...now

range.to_a[-1]

Very basic testing (counting in my head) showed that it took about 4 seconds while the method you provided took about 5-6. Hope this helps.

Edit 1: Removed second solution as it was totally wrong.

like image 44
Topher Fangio Avatar answered Oct 30 '22 13:10

Topher Fangio


I can't think there's any way to achieve this that doesn't involve enumerating the range, at least unless as already mentioned, you have other information about how the range will be constructed and therefore can infer the desired value without enumeration. Of all the suggestions, I'd go with #max, since it seems to be most expressive.

require 'benchmark'
N = 20
Benchmark.bm(30) do |r|
  past, now  = Time.local(2010, 2, 1, 0, 0, 0), Time.now
  @range = past...now
  r.report("range.max") do
    N.times { last_in_range = @range.max }
  end
  r.report("explicit enumeration") do
    N.times { @range.each { |value| last_in_range = value } }
  end
  r.report("range.entries.last") do
    N.times { last_in_range = @range.entries.last }
  end
  r.report("range.to_a[-1]") do
    N.times { last_in_range = @range.to_a[-1] }
  end
end
                                user     system      total        real
range.max                      49.406000   1.515000  50.921000 ( 50.985000)
explicit enumeration           52.250000   1.719000  53.969000 ( 54.156000)
range.entries.last             53.422000   4.844000  58.266000 ( 58.390000)
range.to_a[-1]                 49.187000   5.234000  54.421000 ( 54.500000)

I notice that the 3rd and 4th option have significantly increased system time. I expect that's related to the explicit creation of an array, which seems like a good reason to avoid them, even if they're not obviously more expensive in elapsed time.

like image 43
Mike Woodhouse Avatar answered Oct 30 '22 15:10

Mike Woodhouse