Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is array.min so slow?

I noticed that array.min seems slow, so I did this test against my own naive implementation:

require 'benchmark'
array = (1..100000).to_a.shuffle

Benchmark.bmbm(5) do |x|
  x.report("lib:") { 99.times { min = array.min } }
  x.report("own:") { 99.times { min = array[0]; array.each { |n| min = n if n < min } } }
end

The results:

Rehearsal -----------------------------------------
lib:    1.531000   0.000000   1.531000 (  1.538159)
own:    1.094000   0.016000   1.110000 (  1.102130)
-------------------------------- total: 2.641000sec

            user     system      total        real
lib:    1.500000   0.000000   1.500000 (  1.515249)
own:    1.125000   0.000000   1.125000 (  1.145894)

I'm shocked. How can my own implementation running a block via each beat the built-in? And beat it by so much?

Am I somehow mistaken? Or is this somehow normal? I'm confused.


My Ruby version, running on Windows 8.1 Pro:

C:\>ruby --version
ruby 2.2.3p173 (2015-08-18 revision 51636) [i386-mingw32]
like image 751
Stefan Pochmann Avatar asked Jan 09 '16 15:01

Stefan Pochmann


2 Answers

For those who likes to upgrade to newer versions of software

require 'benchmark'
array = (1..100000).to_a.shuffle

Benchmark.bmbm(5) do |x|
  x.report("lib:") { 99.times { min = array.min } }
  x.report("own:") { 99.times { min = array[0]; array.each { |n| min = n if n < min } } }
end

Rehearsal -----------------------------------------
lib:    0.021326   0.000017   0.021343 (  0.021343)
own:    0.498233   0.001024   0.499257 (  0.499746)
-------------------------------- total: 0.520600sec

            user     system      total        real
lib:    0.018126   0.000000   0.018126 (  0.018139)
own:    0.492046   0.000000   0.492046 (  0.492367)

RUBY_VERSION # => "2.7.1"

If you are looking into solving this in really performant manner: O(log(n)) or O(n), look at https://en.wikipedia.org/wiki/Selection_algorithm#Incremental_sorting_by_selection and https://en.wikipedia.org/wiki/Heap_(data_structure)

like image 142
ted Avatar answered Oct 09 '22 07:10

ted


Have a look at the implementation of Enumerable#min. It might use each eventually to loop through the elements and get the min element, but before that it does some extra checking to see if it needs to return more than one element, or if it needs to compare the elements via a passed block. In your case the elements will get to be compared via min_i function, and I suspect that's where the speed difference comes from - that function will be slower than simply comparing two numbers.

There's no extra optimization for arrays, all enumerables are traversed the same way.

like image 22
eugen Avatar answered Oct 09 '22 09:10

eugen