
Ruby: Why is Array.sort slow for large objects?

A colleague needed to sort an array of ActiveRecord objects in a Rails app. He tried the obvious Array#sort! but it was surprisingly slow, taking 32s for an array of 3700 objects. In case these big fat objects were slowing things down, he reimplemented the sort by sorting an array of small objects, then reordering the original array of ActiveRecord objects to match, as shown in the code below. Tada! The sort now takes 700ms.

That really surprised me. Does Ruby's sort method end up copying objects about the place rather than just references? He's using Ruby 1.8.6/7.

def self.sort_events(events)
  event_sorters = Array.new(events.length) {|i| EventSorter.new(i, events[i])}
  event_sorters.sort!
  event_sorters.collect {|es| events[es.index]} 
end

private

# Class used by sort_events
class EventSorter
  attr_reader :sqn
  attr_reader :time
  attr_reader :index

  def initialize(index, event)
    @index = index  
    @sqn   = event.sqn
    @time  = event.time  
  end

  def <=>(b)
    @time != b.time ? @time <=> b.time : @sqn <=> b.sqn
  end
end
David Waller asked Mar 13 '10 18:03



1 Answer

sort definitely does not copy the objects. One difference that I can imagine between the code using EventSorter and the code without it (which you didn't supply, so I have to guess) is that EventSorter calls event.sqn and event.time exactly once and stores the result in variables. During the sorting only the variables need to be accessed. The original version presumably called sqn and time each time the sort-block was invoked.
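Both points can be checked with a small experiment. This is a minimal sketch, not the asker's code: FakeEvent is a hypothetical stand-in for the ActiveRecord model, and its time_calls counter exists only to show how often the sort block reads the attribute.

```ruby
# Hypothetical stand-in for an ActiveRecord model (not from the question).
class FakeEvent
  class << self
    attr_accessor :time_calls   # illustration-only read counter
  end
  self.time_calls = 0

  def initialize(time)
    @time = time
  end

  # Count every read of the attribute.
  def time
    self.class.time_calls += 1
    @time
  end
end

# 1000 events with distinct times in a scrambled order.
events = Array.new(1000) { |i| FakeEvent.new((i * 7919) % 1000) }

sorted = events.sort { |a, b| a.time <=> b.time }
calls_with_sort = FakeEvent.time_calls

# The result holds the very same objects, only in a different order.
same_objects = sorted.map { |e| e.object_id }.sort ==
               events.map { |e| e.object_id }.sort

puts "same objects: #{same_objects}"
puts "time reads during sort: #{calls_with_sort} for #{events.length} events"
```

The block runs once per comparison and reads the attribute twice each time, so the number of reads grows much faster than the number of events. If each read is expensive on the real model, that cost is multiplied accordingly.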

If this is the case, it can be fixed by using sort_by instead of sort. sort_by only calls the block once per object and then uses the cached results of the block for further comparisons.
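A minimal sketch of the sort_by version, assuming the events respond to time and sqn as in the question. The Event struct here is a hypothetical stand-in for the ActiveRecord model; the array key reproduces the "time first, then sqn" ordering of EventSorter#<=>.

```ruby
# Hypothetical stand-in for the ActiveRecord model in the question.
Event = Struct.new(:time, :sqn)

# Sort by time, breaking ties by sqn. The block runs once per event,
# and the resulting [time, sqn] keys are compared with Array#<=>.
def sort_events(events)
  events.sort_by { |e| [e.time, e.sqn] }
end

events = [Event.new(20, 1), Event.new(10, 2), Event.new(10, 1)]
sorted = sort_events(events)

p sorted.map { |e| [e.time, e.sqn] }
```

This builds one small key array per event, which is the same trade the EventSorter class makes by hand: a little extra allocation up front in exchange for cheap comparisons.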

sepp2k answered Sep 23 '22 18:09
