I have an array of objects that I'm trying to sort by multiple criteria. Most of the comparisons are just by doing <=>
on their hashes, so using sort_by
is very fast, but one of them is more complex.
The array is of soccer teams, and it's currently being sorted like this:
teams.sort_by { |item| [item.points, item.goal_dif, item.goals] }
However, in case at last 2 teams have identical values on those 3 fields, I want the tiebreaker to be a function that I made, a_beat_b(teamA, teamB)
.
I tried using Array.sort
, but it's extremely slow compared to sort_by
for those first few... My implementation was like this:
teams.sort ( |a,b| [a.points, a.goals_dif, a.goals] <=> [b.points, b.goals_dif, b.goals])
It was very slow compared to sort_by. The functions for points, goals_dif and goals require some simple queries, but it gets bogged down if it has to do hundreds.
I'm not very good at Ruby, so not sure where to put my a_beats_b
in there. (It returns 1, 0 or -1 if A beat, drew or lost to B, repsectively)
I tried using
Array.sort
, but it's extremely slow compared tosort_by
for those first few
This is because sort
calls the given block several times. Here's an example to show what's going on under the hood: (sorting "apple"
, "pear"
and "fig"
by length)
def length(str)
puts "calculating #{str.inspect}.length"
str.length
end
array = %w{apple pear fig}
array.sort { |a, b| length(a) <=> length(b) }
#=> ["fig", "pear", "apple"]
Output from our length
method:
calculating "apple".length
calculating "pear".length
calculating "apple".length
calculating "fig".length
calculating "pear".length
calculating "fig".length
As you can see, length
is called multiple times during the sort. Imagine that these are database queries.
sort_by
on the other hand calls the block once for each element, building an internal mapping:
array.sort_by { |a| length(a) }
#=> ["fig", "pear", "apple"]
Output:
calculating "apple".length
calculating "pear".length
calculating "fig".length
For expensive operations (like database queries), this is much faster. But it's also less flexible – you can't dynamically compare a
and b
any more.
You can however store the results of your (expensive) operation, for example by using a hash: (this is called memoization)
hash = Hash.new { |h, k| h[k] = length(k) }
And use the hash within sort
:
array.sort { |a, b| hash[a] <=> hash[b] }
# calculating "apple".length
# calculating "pear".length
# calculating "fig".length
#=> ["fig", "pear", "apple"]
After the sort, our hash looks like this:
hash #=> {"apple"=>5, "pear"=>4, "fig"=>3}
Applied to your code, something like this should work:
hash = Hash.new { |h, k| h[k] = [k.points, k.goal_dif, k.goals] }
teams.sort { |a, b| hash[a] == hash[b] ? a_beats_b(a, b) : hash[a] <=> hash[b] }
Implementation of sort
with a_beats_b
included:
teams.sort do |a, b|
result = [a.points, a.goals_dif, a.goals] <=> [b.points, b.goals_dif, b.goals]
result.zero? ? a_beats_b(a, b) : result
end
Here's another approach that, while somewhat complex, has been designed for efficiency. The method uses the following steps.
Team
instance to to an array containing the instance and the three-element array on which the inexpensive sort is to be done.Team
instance in the two-element array. Team
instances after it has been sorted by the method a_beat_b(a, b)
(unless the group contains only one team, of course). Code
def sort_em(teams)
teams.map { |t| [t, [t.points, t.goal_dif, t.goals]] }.
sort_by(&:last).
chunk(&:last).
map { |_,tied_teams| tied_teams.map(&:first) }.
flat_map { |tied_teams| (tied_teams.size == 1) ?
tied_teams.first : tied_teams.sort { |a,b| a_beat_b(a, b) } }
end
Example
class Team
attr_reader :name, :points, :goal_dif, :goals
def initialize(name, points, goal_dif, goals)
@name, @points, @goal_dif, @goals = name, points, goal_dif, goals
end
end
teams = [Team.new("bluebirds", 233, 25, 130),
Team.new("eagles", 233, 18, 105),
Team.new("jays", 233, 25, 130),
Team.new("owls", 160, 12, 105),
Team.new("sparrows", 233, 18, 105)
]
#=> [#<Team:0x007ff2f900e5a8 @name="bluebirds", @points=233, @goal_dif=25, @goals=130>,
# #<Team:0x007ff2f900e530 @name="eagles", @points=233, @goal_dif=18, @goals=105>,
# #<Team:0x007ff2f900e4b8 @name="jays", @points=233, @goal_dif=25, @goals=130>,
# #<Team:0x007ff2f900e440 @name="owls", @points=160, @goal_dif=12, @goals=105>,
# #<Team:0x007ff2f900e3c8 @name="sparrows", @points=233, @goal_dif=18, @goals=105>]
def a_beat_b(a, b)
a.name.size <=> b.name.size
end
sort_em(teams)
#=> [#<Team:0x007ff2fa845630 @name="owls", @points=160, @goal_dif=12, @goals=105>,
# #<Team:0x007ff2fa845720 @name="eagles", @points=233, @goal_dif=18, @goals=105>,
# #<Team:0x007ff2fa8455b8 @name="sparrows", @points=233, @goal_dif=18, @goals=105>,
# #<Team:0x007ff2fa8456a8 @name="jays", @points=233, @goal_dif=25, @goals=130>,
# #<Team:0x007ff2fa8457e8 @name="bluebirds", @points=233, @goal_dif=25, @goals=130>]
Explanation
The steps are as follows.
a = teams.map { |t| [t, [t.points, t.goal_dif, t.goals]] }
#=> [[#<Team:0x007ff2fa8457e8 @name="bluebirds", @points=233, @goal_dif=25, @goals=130>,
# [233, 25, 130]],
# [#<Team:0x007ff2fa845720 @name="eagles", @points=233, @goal_dif=18, @goals=105>,
# [233, 18, 105]],
# [#<Team:0x007ff2fa8456a8 @name="jays", @points=233, @goal_dif=25, @goals=130>,
# [233, 25, 130]],
# [#<Team:0x007ff2fa845630 @name="owls", @points=160, @goal_dif=12, @goals=105>,
# [160, 12, 105]],
# [#<Team:0x007ff2fa8455b8 @name="sparrows", @points=233, @goal_dif=18, @goals=105>,
# [233, 18, 105]]]
b = a.sort_by(&:last)
#=> [[#<Team:0x007ff2fa845630 @name="owls", @points=160, @goal_dif=12, @goals=105>,
# [160, 12, 105]],
# [#<Team:0x007ff2fa845720 @name="eagles", @points=233, @goal_dif=18, @goals=105>,
# [233, 18, 105]],
# [#<Team:0x007ff2fa8455b8 @name="sparrows", @points=233, @goal_dif=18, @goals=105>,
# [233, 18, 105]],
# [#<Team:0x007ff2fa8457e8 @name="bluebirds", @points=233, @goal_dif=25, @goals=130>,
# [233, 25, 130]],
# [#<Team:0x007ff2fa8456a8 @name="jays", @points=233, @goal_dif=25, @goals=130>,
# [233, 25, 130]]
# ]
c = b.chunk(&:last)
#=> #<Enumerator: #<Enumerator::Generator:0x007ff2fa81dc20>:each>
To see what values are generated by the enumerator c
we can convert it to an array.
c.to_a
#=> [[[160, 12, 105],
# [[#<Team:0x007ff2fa845630 @name="owls",@points=160,@goal_dif=12,@goals=105>,
# [160, 12, 105]
# ]
# ]
# ],
# [[233, 18, 105],
# [[#<Team:0x007ff2fa845720 @name="eagles",@points=233,@goal_dif=18,@goals=105>,
# [233, 18, 105]
# ],
# [#<Team:0x007ff2fa8455b8 @name="sparrows",@points=233,@goal_dif=18,@goals=105>,
# [233, 18, 105]
# ]
# ],
# [[233, 25, 130],
# [[#<Team:0x007ff2fa8457e8 @name="bluebirds",@points=233,@goal_dif=25,@goals=130>,
# [233, 25, 130]
# ],
# [#<Team:0x007ff2fa8456a8 @name="jays", @points=233,@goal_dif=25,@goals=130>,
# [233, 25, 130]
# ]
# ]
# ]
# ]
d = c.map { |_,tied_teams| tied_teams.map(&:first) }
#=> [[#<Team:0x007ff2fa845630 @name="owls", @points=160, @goal_dif=12, @goals=105>],
# [#<Team:0x007ff2fa845720 @name="eagles", @points=233, @goal_dif=18, @goals=105>,
# #<Team:0x007ff2fa8455b8 @name="sparrows", @points=233, @goal_dif=18, @goals=105>
# ],
# [#<Team:0x007ff2fa8457e8 @name="bluebirds", @points=233, @goal_dif=25, @goals=130>,
# #<Team:0x007ff2fa8456a8 @name="jays", @points=233, @goal_dif=25, @goals=130>
# ]
# ]
d.flat_map { |tied_teams| (tied_teams.size == 1) ?
tied_teams.first : tied_teams.sort { |a,b| a_beat_b(a, b) } }
#=> [#<Team:0x007ff2fa845630 @name="owls", @points=160, @goal_dif=12, @goals=105>,
# #<Team:0x007ff2fa845720 @name="eagles", @points=233, @goal_dif=18, @goals=105>,
# #<Team:0x007ff2fa8455b8 @name="sparrows", @points=233, @goal_dif=18, @goals=105>,
# #<Team:0x007ff2fa8456a8 @name="jays", @points=233, @goal_dif=25, @goals=130>,
# #<Team:0x007ff2fa8457e8 @name="bluebirds", @points=233, @goal_dif=25, @goals=130>
# ]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With