Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Ruby Array Sort 2 different ways

I have an array of objects that I'm trying to sort by multiple criteria. Most of the comparisons are just by doing <=> on their hashes, so using sort_by is very fast, but one of them is more complex.

The array is of soccer teams, and it's currently being sorted like this:

teams.sort_by { |item| [item.points, item.goal_dif, item.goals] }

However, in case at last 2 teams have identical values on those 3 fields, I want the tiebreaker to be a function that I made, a_beat_b(teamA, teamB).

I tried using Array.sort, but it's extremely slow compared to sort_by for those first few... My implementation was like this:

teams.sort ( |a,b| [a.points, a.goals_dif, a.goals] <=> [b.points, b.goals_dif, b.goals])
It was very slow compared to sort_by. The functions for points, goals_dif and goals require some simple queries, but it gets bogged down if it has to do hundreds.

I'm not very good at Ruby, so not sure where to put my a_beats_b in there. (It returns 1, 0 or -1 if A beat, drew or lost to B, repsectively)

like image 660
Andre Avatar asked Sep 07 '16 15:09

Andre


3 Answers

I tried using Array.sort, but it's extremely slow compared to sort_by for those first few

This is because sort calls the given block several times. Here's an example to show what's going on under the hood: (sorting "apple", "pear" and "fig" by length)

def length(str)
  puts "calculating #{str.inspect}.length"
  str.length
end

array = %w{apple pear fig}
array.sort { |a, b| length(a) <=> length(b) }
#=> ["fig", "pear", "apple"]

Output from our length method:

calculating "apple".length
calculating "pear".length
calculating "apple".length
calculating "fig".length
calculating "pear".length
calculating "fig".length

As you can see, length is called multiple times during the sort. Imagine that these are database queries.

sort_by on the other hand calls the block once for each element, building an internal mapping:

array.sort_by { |a| length(a) }
#=> ["fig", "pear", "apple"]

Output:

calculating "apple".length
calculating "pear".length
calculating "fig".length

For expensive operations (like database queries), this is much faster. But it's also less flexible – you can't dynamically compare a and b any more.

You can however store the results of your (expensive) operation, for example by using a hash: (this is called memoization)

hash = Hash.new { |h, k| h[k] = length(k) }

And use the hash within sort:

array.sort { |a, b| hash[a] <=> hash[b] }
# calculating "apple".length
# calculating "pear".length
# calculating "fig".length
#=> ["fig", "pear", "apple"]

After the sort, our hash looks like this:

hash #=> {"apple"=>5, "pear"=>4, "fig"=>3}

Applied to your code, something like this should work:

hash = Hash.new { |h, k| h[k] = [k.points, k.goal_dif, k.goals] }
teams.sort { |a, b| hash[a] == hash[b] ? a_beats_b(a, b) : hash[a] <=> hash[b] }
like image 123
Stefan Avatar answered Oct 19 '22 16:10

Stefan


Implementation of sort with a_beats_b included:

teams.sort do |a, b|
  result = [a.points, a.goals_dif, a.goals] <=> [b.points, b.goals_dif, b.goals]
  result.zero? ? a_beats_b(a, b) : result
end
like image 37
Aleksei Matiushkin Avatar answered Oct 19 '22 16:10

Aleksei Matiushkin


Here's another approach that, while somewhat complex, has been designed for efficiency. The method uses the following steps.

  • Convert each Team instance to to an array containing the instance and the three-element array on which the inexpensive sort is to be done.
  • Use Enumerable#sort_by to sort the arrays by the three-element arrays.
  • Use Enumerable#chunk to group the two-element arrays that have equal three-element arrays.
  • Map each chunked array element to the Team instance in the two-element array.
  • Use Enumerable#flat_map to map each chunked group of Team instances after it has been sorted by the method a_beat_b(a, b) (unless the group contains only one team, of course).

Code

def sort_em(teams)
  teams.map { |t| [t, [t.points, t.goal_dif, t.goals]] }.
        sort_by(&:last).
        chunk(&:last).
        map { |_,tied_teams| tied_teams.map(&:first) }.
        flat_map { |tied_teams| (tied_teams.size == 1) ?
          tied_teams.first : tied_teams.sort { |a,b| a_beat_b(a, b) } }
end

Example

class Team
  attr_reader :name, :points, :goal_dif, :goals
  def initialize(name, points, goal_dif, goals)
    @name, @points, @goal_dif, @goals = name, points, goal_dif, goals
  end
end

teams = [Team.new("bluebirds", 233, 25, 130),
         Team.new("eagles",    233, 18, 105),
         Team.new("jays",      233, 25, 130),
         Team.new("owls",      160, 12, 105),
         Team.new("sparrows",  233, 18, 105)
        ]
  #=> [#<Team:0x007ff2f900e5a8 @name="bluebirds", @points=233, @goal_dif=25, @goals=130>,
  #    #<Team:0x007ff2f900e530 @name="eagles", @points=233, @goal_dif=18, @goals=105>,
  #    #<Team:0x007ff2f900e4b8 @name="jays", @points=233, @goal_dif=25, @goals=130>,
  #    #<Team:0x007ff2f900e440 @name="owls", @points=160, @goal_dif=12, @goals=105>,
  #    #<Team:0x007ff2f900e3c8 @name="sparrows", @points=233, @goal_dif=18, @goals=105>]

def a_beat_b(a, b)
  a.name.size <=> b.name.size
end

sort_em(teams)
  #=> [#<Team:0x007ff2fa845630 @name="owls", @points=160, @goal_dif=12, @goals=105>,
  #    #<Team:0x007ff2fa845720 @name="eagles", @points=233, @goal_dif=18, @goals=105>,
  #    #<Team:0x007ff2fa8455b8 @name="sparrows", @points=233, @goal_dif=18, @goals=105>,
  #    #<Team:0x007ff2fa8456a8 @name="jays", @points=233, @goal_dif=25, @goals=130>,
  #    #<Team:0x007ff2fa8457e8 @name="bluebirds", @points=233, @goal_dif=25, @goals=130>]

Explanation

The steps are as follows.

a = teams.map { |t| [t, [t.points, t.goal_dif, t.goals]] }
  #=> [[#<Team:0x007ff2fa8457e8 @name="bluebirds", @points=233, @goal_dif=25, @goals=130>,
  #     [233, 25, 130]],
  #    [#<Team:0x007ff2fa845720 @name="eagles", @points=233, @goal_dif=18, @goals=105>,
  #     [233, 18, 105]],
  #    [#<Team:0x007ff2fa8456a8 @name="jays", @points=233, @goal_dif=25, @goals=130>,
  #     [233, 25, 130]],
  #    [#<Team:0x007ff2fa845630 @name="owls", @points=160, @goal_dif=12, @goals=105>,
  #     [160, 12, 105]],
  #    [#<Team:0x007ff2fa8455b8 @name="sparrows", @points=233, @goal_dif=18, @goals=105>,
  #     [233, 18, 105]]] 
b = a.sort_by(&:last)
  #=> [[#<Team:0x007ff2fa845630 @name="owls", @points=160, @goal_dif=12, @goals=105>,
  #    [160, 12, 105]],
  #   [#<Team:0x007ff2fa845720 @name="eagles", @points=233, @goal_dif=18, @goals=105>,
  #    [233, 18, 105]],
  #   [#<Team:0x007ff2fa8455b8 @name="sparrows", @points=233, @goal_dif=18, @goals=105>,
  #    [233, 18, 105]],
  #   [#<Team:0x007ff2fa8457e8 @name="bluebirds", @points=233, @goal_dif=25, @goals=130>,
  #    [233, 25, 130]],
  #   [#<Team:0x007ff2fa8456a8 @name="jays", @points=233, @goal_dif=25, @goals=130>,
  #    [233, 25, 130]]
  #   ] 
c = b.chunk(&:last)
  #=> #<Enumerator: #<Enumerator::Generator:0x007ff2fa81dc20>:each> 

To see what values are generated by the enumerator c we can convert it to an array.

c.to_a 
  #=> [[[160, 12, 105],
  #     [[#<Team:0x007ff2fa845630 @name="owls",@points=160,@goal_dif=12,@goals=105>,
  #       [160, 12, 105]
  #      ]
  #     ]
  #    ],
  #    [[233, 18, 105],
  #     [[#<Team:0x007ff2fa845720 @name="eagles",@points=233,@goal_dif=18,@goals=105>,
  #       [233, 18, 105]
  #      ],
  #     [#<Team:0x007ff2fa8455b8 @name="sparrows",@points=233,@goal_dif=18,@goals=105>,
  #       [233, 18, 105]
  #     ]
  #    ],
  #    [[233, 25, 130],
  #     [[#<Team:0x007ff2fa8457e8 @name="bluebirds",@points=233,@goal_dif=25,@goals=130>,
  #       [233, 25, 130]
  #      ],
  #      [#<Team:0x007ff2fa8456a8 @name="jays", @points=233,@goal_dif=25,@goals=130>,
  #       [233, 25, 130]
  #      ]
  #     ]
  #    ]
  #   ]

d = c.map { |_,tied_teams| tied_teams.map(&:first) }
  #=> [[#<Team:0x007ff2fa845630 @name="owls", @points=160, @goal_dif=12, @goals=105>],
  #    [#<Team:0x007ff2fa845720 @name="eagles", @points=233, @goal_dif=18, @goals=105>,
  #     #<Team:0x007ff2fa8455b8 @name="sparrows", @points=233, @goal_dif=18, @goals=105>
  #    ],
  #    [#<Team:0x007ff2fa8457e8 @name="bluebirds", @points=233, @goal_dif=25, @goals=130>,
  #     #<Team:0x007ff2fa8456a8 @name="jays", @points=233, @goal_dif=25, @goals=130>
  #    ]
  #   ] 
d.flat_map { |tied_teams| (tied_teams.size == 1) ?
  tied_teams.first : tied_teams.sort { |a,b| a_beat_b(a, b) } }
  #=> [#<Team:0x007ff2fa845630 @name="owls", @points=160, @goal_dif=12, @goals=105>,
  #    #<Team:0x007ff2fa845720 @name="eagles", @points=233, @goal_dif=18, @goals=105>,
  #    #<Team:0x007ff2fa8455b8 @name="sparrows", @points=233, @goal_dif=18, @goals=105>,
  #    #<Team:0x007ff2fa8456a8 @name="jays", @points=233, @goal_dif=25, @goals=130>,
  #    #<Team:0x007ff2fa8457e8 @name="bluebirds", @points=233, @goal_dif=25, @goals=130>
  #   ] 
like image 35
Cary Swoveland Avatar answered Oct 19 '22 17:10

Cary Swoveland