
Is sum or reduce(:+) better in Ruby/Rails? Are there considerations other than speed?

It seems like #sum is faster than #reduce for long arrays, and they are basically the same for short ones.

def reduce_t(s,f)
  start = Time.now
  puts (s..f).reduce(:+) #Printing the result just to make sure something is happening.
  finish = Time.now
  puts finish - start
end
def sum_t(s,f)
  start = Time.now
  puts (s..f).sum
  finish = Time.now
  puts finish - start
end

irb(main):078:0> sum_t(1,10); reduce_t(1,10)
55
0.000445
55
0.000195
=> nil
irb(main):079:0> sum_t(1,1000000); reduce_t(1,1000000)
500000500000
8.1e-05
500000500000
0.101487
=> nil

Are there considerations other than speed? Are there any situations when it would be better to use #reduce instead of #sum to accomplish the same end, a simple sum?

Edit

mu is too short rightly pointed out that I should do numerous iterations before drawing conclusions about timing results. I didn't use Benchmark because I'm not familiar with it yet, but I hope what I've written below will be adequate and convincing.

def sum_reduce_t(s,f)
  time_reduce = 0
  time_sum = 0
  reduce_faster = 0
  sum_faster = 0
  30.times do
    start_reduce = Time.now
    (s..f).reduce(:+)
    elapsed_reduce = Time.now - start_reduce
    time_reduce += elapsed_reduce
    start_sum = Time.now
    (s..f).sum
    elapsed_sum = Time.now - start_sum
    time_sum += elapsed_sum
    # Compare this iteration's timings, not the running totals.
    if elapsed_sum > elapsed_reduce
      reduce_faster += 1
    else
      sum_faster += 1
    end
  end
  puts "Total time (s) spent on reduce: #{time_reduce}"
  puts "Total time (s) spent on sum: #{time_sum}"
  puts "Number of times reduce is faster: #{reduce_faster}"
  puts "Number of times sum is faster: #{sum_faster}"
end

irb(main):205:0> sum_reduce_t(1,10)
Total time (s) spent on reduce: 0.00023900000000000004
Total time (s) spent on sum: 0.00015400000000000003
Number of times reduce is faster: 0
Number of times sum is faster: 30
=> nil
irb(main):206:0> sum_reduce_t(1,100)
Total time (s) spent on reduce: 0.0011480000000000004
Total time (s) spent on sum: 0.00024999999999999995
Number of times reduce is faster: 0
Number of times sum is faster: 30
=> nil
irb(main):207:0> sum_reduce_t(1,1000)
Total time (s) spent on reduce: 0.004804000000000001
Total time (s) spent on sum: 0.00019899999999999996
Number of times reduce is faster: 0
Number of times sum is faster: 30
=> nil
irb(main):208:0> sum_reduce_t(1,10000)
Total time (s) spent on reduce: 0.031862
Total time (s) spent on sum: 0.00010299999999999996
Number of times reduce is faster: 0
Number of times sum is faster: 30
=> nil
irb(main):209:0> sum_reduce_t(1,100000)
Total time (s) spent on reduce: 0.286317
Total time (s) spent on sum: 0.00013199999999999998
Number of times reduce is faster: 0
Number of times sum is faster: 30
=> nil
irb(main):210:0> sum_reduce_t(1,1000000)
Total time (s) spent on reduce: 2.7116779999999996
Total time (s) spent on sum: 0.00021200000000000008
Number of times reduce is faster: 0
Number of times sum is faster: 30
=> nil    
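
For comparison, the same measurement can be sketched with the standard library's Benchmark module. This is only a rough sketch: `time_sum_and_reduce` is a hypothetical helper name, and the 30 iterations simply mirror the loop above. It also times an Array built from the range, on the assumption that Range#sum can shortcut integer ranges with a closed-form formula (which would explain why its time barely grows above), whereas an Array has to be walked element by element.

```ruby
require 'benchmark'

# Hypothetical helper: total seconds spent over `iterations` runs of each approach.
def time_sum_and_reduce(collection, iterations = 30)
  reduce_time = Benchmark.realtime { iterations.times { collection.reduce(:+) } }
  sum_time    = Benchmark.realtime { iterations.times { collection.sum } }
  { reduce: reduce_time, sum: sum_time }
end

range = (1..100_000)
p time_sum_and_reduce(range)       # Range: sum may take a closed-form shortcut (assumption)
p time_sum_and_reduce(range.to_a)  # Array: both approaches visit every element
```

The absolute numbers will vary by machine and Ruby version, so only the relative difference is worth reading.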

My question remains: are there ever times when it makes sense to use #reduce instead of #sum?

Leo Folsom asked Dec 11 '22 14:12


1 Answer

One case where the behaviour and result of sum differ from inject &:+ (inject is an alias of reduce) is when you are summing floating point values.

If you add a large floating point value to a small one, often the result is just the same as the larger one:

> 99999999999999.98 + 0.001
=> 99999999999999.98

This can lead to errors when adding arrays of floats, as the smaller values are effectively lost, even if there are a lot of them.

For example:

> a = [99999999999999.98, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001]
=> [99999999999999.98, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001]
> a.inject(&:+)
=> 99999999999999.98

In this example, you could add 0.001 as often as you want and it would never change the value of the result.
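
To see why the small values vanish, it can help to look at the spacing between adjacent floats near the large value. This is a minimal sketch, assuming Ruby's Float is an IEEE 754 double (true on common platforms); `spacing_near` is a hypothetical helper name, not a built-in.

```ruby
# Hypothetical helper: distance from x to the next representable Float above it.
def spacing_near(x)
  x.next_float - x
end

big = 99999999999999.98
gap = spacing_near(big)

puts gap                 # spacing between neighbouring floats near big
puts 0.001 < gap / 2     # an addend below half the spacing is rounded away
puts big + 0.001 == big  # so adding it leaves big unchanged
```

Because 0.001 is less than half the gap, each individual addition rounds back to the same float, no matter how many times it is repeated.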

Ruby’s implementation of sum uses the Kahan-Babuska compensated summation algorithm (a variant of Kahan summation) when summing floats to reduce this error:

> a.sum
=> 100000000000000.0

(Note the result here: you might be expecting something ending in .99, as there are ten 0.001 values in the array. This is just normal floating point behaviour; perhaps I should have tried to find a better example. The important point is that the sum does increase as you add lots of small values, which doesn’t happen with inject &:+.)
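
The idea behind compensated summation can be sketched in plain Ruby. This is an illustrative version of the Kahan-Babuska approach, not Ruby's actual C implementation; `compensated_sum` is a hypothetical name. It keeps a separate running total of the low-order bits that each addition would otherwise discard, and adds that correction back at the end.

```ruby
# Illustrative sketch of Kahan-Babuska compensated summation (hypothetical helper).
def compensated_sum(values)
  sum = 0.0
  compensation = 0.0 # accumulates the rounding error lost by each addition
  values.each do |x|
    t = sum + x
    if sum.abs >= x.abs
      compensation += (sum - t) + x # low-order bits of x were lost
    else
      compensation += (x - t) + sum # low-order bits of sum were lost
    end
    sum = t
  end
  sum + compensation
end

a = [99999999999999.98] + [0.001] * 10
puts a.inject(:+)        # naive left-to-right addition
puts compensated_sum(a)  # compensated result
```

With the array from the example above, the naive addition stays at the first element's value, while the compensated version gives the same result as a.sum.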

matt answered Dec 15 '22 00:12