
Sorting By Multiple Conditions in Ruby

I have a collection of Post objects and I want to be able to sort them based on these conditions:

  • First, by category (news, events, labs, portfolio, etc.)
  • Then by date, if date, or by position, if a specific index was set for it

Some posts will have dates (news and events), others will have explicit positions (labs, and portfolio).

I want to be able to call posts.sort!, so I've overridden <=>, but am looking for the most effective way of sorting by these conditions. Below is a pseudo method:

def <=>(other)
  # first, everything is sorted into 
  # smaller chunks by category
  self.category <=> other.category

  # then, per category, by date or position
  if self.date and other.date
    self.date <=> other.date
  else
    self.position <=> other.position
  end
end

It seems like I'd have to actually sort two separate times, rather than cramming everything into that one method: something like sort_by_category, then sort!. What is the most Ruby way to do this?

Lance asked Apr 13 '10 23:04



2 Answers

You should always sort by the same criteria to ensure a meaningful order. When comparing two nil dates, it is fine for the position to determine the order, but when comparing a nil date with a set date, you have to decide which goes first irrespective of the position (for example, by mapping nil to a day far in the past).

Otherwise imagine the following:

a.date = nil                   ; a.position = 1
b.date = Time.now - 1.day      ; b.position = 2
c.date = Time.now              ; c.position = 0

By your original criteria, you would have a < b < c < a. So which one is the smallest?
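
To make the cycle concrete, here is a minimal sketch that applies the date-or-position rule from the question to those three posts. CyclePost and original_compare are hypothetical names used only for illustration, and 86_400 seconds stands in for 1.day so that no Rails helpers are needed.

```ruby
# Hypothetical stand-in for Post, with only the fields the rule looks at.
CyclePost = Struct.new(:name, :date, :position)

# The rule from the question: compare dates when both are set,
# otherwise fall back to positions.
def original_compare(x, y)
  if x.date && y.date
    x.date <=> y.date
  else
    x.position <=> y.position
  end
end

now = Time.now
a = CyclePost.new("a", nil,          1)
b = CyclePost.new("b", now - 86_400, 2)  # one day earlier, in seconds
c = CyclePost.new("c", now,          0)

# Every comparison says "left is smaller", so the three results form a cycle.
cycle = [original_compare(a, b), original_compare(b, c), original_compare(c, a)]
```

All three comparisons return -1, which is why no element can be called the smallest and why the result of sort would depend on the order in which pairs happen to be compared.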

You also want to do the sort in one go. For your <=> implementation, use #nonzero?:

def <=>(other)
  return nil unless other.is_a?(Post)
  (self.category <=> other.category).nonzero? ||
  ((self.date || AGES_AGO) <=> (other.date || AGES_AGO)).nonzero? ||
  (self.position <=> other.position).nonzero? ||
  0
end
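
For reference, here is a self-contained sketch of how that method might sit in a class. The Post class below, its constructor, and the choice of Time.at(0) for AGES_AGO are all assumptions made for illustration; any value that sorts before every real date would do as the sentinel.

```ruby
# Minimal hypothetical Post; the real class will have more to it.
class Post
  # Assumed sentinel: sorts before any realistic date.
  AGES_AGO = Time.at(0)

  attr_reader :category, :date, :position

  def initialize(category, date: nil, position: 0)
    @category = category
    @date = date
    @position = position
  end

  # nonzero? returns nil for 0, so || falls through to the next criterion.
  def <=>(other)
    return nil unless other.is_a?(Post)
    (category <=> other.category).nonzero? ||
      ((date || AGES_AGO) <=> (other.date || AGES_AGO)).nonzero? ||
      (position <=> other.position).nonzero? ||
      0
  end
end

posts = [
  Post.new("news", date: Time.at(2_000)),
  Post.new("labs", position: 2),
  Post.new("news", date: Time.at(1_000)),
  Post.new("labs", position: 1),
]

# sort (and sort!) pick up the <=> defined above.
sorted = posts.sort
summary = sorted.map { |p| [p.category, p.date&.to_i, p.position] }
```

With this data, the labs posts come first, ordered by position, followed by the news posts ordered by date.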

If you use your comparison criteria just once, or if the criteria are not universal and you therefore don't want to define <=>, you could use sort with a block:

post_ary.sort { |a, b| (a.category <=> ...).nonzero? || ... }

Better still, there are sort_by and sort_by!, which let you build an array of the values to compare, in order of priority:

post_ary.sort_by{|a| [a.category, a.date || AGES_AGO, a.position] }

Besides being shorter, sort_by has the advantage that the ordering it yields is necessarily consistent, because every element is reduced to a key and only the keys are compared.
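
A runnable sketch of the sort_by version follows. SortablePost, the Time.at(0) sentinel, and the `|| 0` guard on position are assumptions added for illustration; the guard is there because an array key containing nil in one post and an integer in another cannot be compared.

```ruby
# Hypothetical stand-in for Post.
SortablePost = Struct.new(:category, :date, :position)

# Assumed sentinel that sorts before any real date.
AGES_AGO = Time.at(0)

posts = [
  SortablePost.new("news", Time.at(2_000), nil),
  SortablePost.new("labs", nil, 2),
  SortablePost.new("news", Time.at(1_000), nil),
  SortablePost.new("labs", nil, 1),
]

# Keys are compared element by element: category, then date, then position.
sorted = posts.sort_by { |p| [p.category, p.date || AGES_AGO, p.position || 0] }
summary = sorted.map { |p| [p.category, p.date&.to_i, p.position] }
```

The original array is left untouched here; sort_by! would reorder it in place.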

Notes:

  • sort_by! was introduced in Ruby 1.9.2. You can require 'backports/1.9.2/array/sort_by' to use it with older Rubies.
  • I'm assuming that Post is not a subclass of ActiveRecord::Base (in which case you'd want the sort to be done by the db server).
Marc-André Lafortune answered Sep 28 '22 03:09



Alternatively, you could do the sort in one fell swoop by comparing arrays. The only gotcha is handling the case where one of the attributes is nil, although that can still be handled, if you know the data set, by choosing an appropriate nil guard. Also, it's not clear from your pseudocode whether the date and position comparisons are listed in priority order or are an either/or choice (i.e. use date if it exists for both, else use position). The first solution assumes category, followed by date, followed by position:

def <=>(other)
    [self.category, self.date, self.position] <=> [other.category, other.date, other.position]
end
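
The nil gotcha mentioned above can be seen with plain arrays. This is only a sketch with made-up values: when one date is nil and the other is set, the element comparison returns nil, the array comparison returns nil as well, and sort raises.

```ruby
with_date    = ["news", Time.at(1_000), 1]
without_date = ["news", nil, 2]

# Time <=> nil is nil, so the whole array comparison is nil.
mixed = with_date <=> without_date

# nil <=> nil is 0, so two missing dates fall through to position.
both_nil = ["news", nil, 1] <=> ["news", nil, 2]

# sort cannot work with a nil comparison result and raises ArgumentError.
error = begin
  [with_date, without_date].sort
  nil
rescue ArgumentError => e
  e
end
```

A guard such as `date || some_default` on both sides, as discussed in the other answer, avoids the nil result.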

The second assumes it's date or position:

def <=>(other)
    if self.date && other.date
        [self.category, self.date] <=> [other.category, other.date]
    else
        [self.category, self.position] <=> [other.category, other.position]
    end
end
naven87 answered Sep 28 '22 03:09
