
Sorting a string that could contain either a time or distance


I have implemented a sorting algorithm for a custom string that represents either time or distance data for track & field events. The format is:

'10:03.00 - Either ten minutes and three seconds or 10 feet, three inches

The result of the sort is that for field events, the longest throw or jump would be the first element while for running events, the fastest time would be first. Below is the code I am currently using for field events. I didn't post the running_event_sort since it is the same logic with the greater than/less than swapped. While it works, it just seems overly complex and needs to be refactored. I am open to suggestions. Any help would be great.

event_participants.sort!{ |a, b| Participant.field_event_sort(a, b) }

class Participant
  def self.field_event_sort(a, b)
    a_parts = a.time_distance.scan(/'([\d]*):([\d]*).([\d]*)/)
    b_parts = b.time_distance.scan(/'([\d]*):([\d]*).([\d]*)/)

    if a_parts.empty? || b_parts.empty?
      0
    elsif a_parts[0][0] == b_parts[0][0]
      if a_parts[0][1] == b_parts[0][1]
        if a_parts[0][2] > b_parts[0][2]
          -1
        elsif a_parts[0][2] < b_parts[0][2]
          1
        else
          0
        end
      elsif a_parts[0][1] > b_parts[0][1]
        -1
      else
        1
      end
    elsif a_parts[0][0] > b_parts[0][0]
      -1
    else
      1
    end
  end
end
Chris Williams asked Jun 07 '09 23:06



1 Answer

This is a situation where #sort_by could simplify your code enormously:

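event_participants = event_participants.sort_by do |participant|
  # Match against the string attribute, not the Participant object itself.
  if participant.time_distance =~ /'(\d+):(\d+)\.(\d+)/
    # Convert each captured group to an integer so 10 sorts after 9.
    [$1, $2, $3].map(&:to_i)
  else
    # Unparsable entries get an empty key, which sorts before every real key.
    []
  end
end.reverse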

Here, I parse the relevant times into an array of integers, and use those as a sorting key for the data. Array comparisons are done entry by entry, with the first being the most significant, so this works well.
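To make the entry-by-entry comparison concrete, here is a small sketch using `Array#<=>`. The sample values are made up for illustration and are not from the question.

```ruby
# Arrays compare element by element, left to right; the first difference decides.
longer  = [10, 3, 0]   # illustrative: 10 feet, 3.00 inches
shorter = [9, 11, 50]  # illustrative: 9 feet, 11.50 inches

# 10 > 9, so the remaining entries are never consulted.
first_cmp = longer <=> shorter

# The first two entries tie, so the last entry decides.
last_cmp = [10, 3, 0] <=> [10, 3, 5]

# An empty array is a prefix of every array, so it compares as smaller.
empty_cmp = [] <=> [10, 3, 0]

puts first_cmp  # 1
puts last_cmp   # -1
puts empty_cmp  # -1
```

This is also why the empty array works as a fallback key: it always lands at one end of the sorted list.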

One thing you don't do is convert the digits to integers, which you most likely want to do. Otherwise, you'll have issues with "100" < "2" #=> true. This is why I added the #map step.
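A quick sketch of the difference between comparing digit strings and comparing integers. The values are illustrative only.

```ruby
# Strings compare character by character, so "1" < "2" decides the result.
string_cmp = "100" < "2"

# After conversion the comparison is numeric.
integer_cmp = "100".to_i < "2".to_i

# The same effect shows up when sorting.
sorted_as_strings  = ["100", "2", "30"].sort
sorted_as_integers = ["100", "2", "30"].map(&:to_i).sort

puts string_cmp                  # true
puts integer_cmp                 # false
puts sorted_as_strings.inspect   # ["100", "2", "30"]
puts sorted_as_integers.inspect  # [2, 30, 100]
```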

Also, in your regex, the square brackets around \d are unnecessary, and you do want to escape the period so that it matches only a literal period rather than any character.
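As a rough illustration of the escaping point, the sketch below runs both patterns against a malformed input. The variable names and the malformed sample are assumptions chosen for the demonstration.

```ruby
loose_pattern  = /'([\d]*):([\d]*).([\d]*)/  # pattern from the question
strict_pattern = /'(\d+):(\d+)\.(\d+)/       # escaped period, no extra brackets

malformed   = "'9:99:9"    # second separator is a colon, not a period
well_formed = "'10:03.00"

# The unescaped period accepts the colon as if it were the separator.
loose_result = malformed.scan(loose_pattern)

# The escaped period rejects the malformed input entirely.
strict_result = strict_pattern.match(malformed)

# Both patterns agree on well-formed input.
strict_parts = well_formed.scan(strict_pattern)

puts loose_result.inspect   # [["9", "99", "9"]]
puts strict_result.inspect  # nil
puts strict_parts.inspect   # [["10", "03", "00"]]
```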

One way the code I gave doesn't match the code you gave is in the situation where a line doesn't contain any distances. Your code compares such a line as equal to every other line, which may get you into trouble if the sorting algorithm assumes equality is transitive (that is, a == b and b == c implies a == c). That is not the case for your code: for example, a = "'10:00.1", b = "frog", c = "'9:99.9".
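The transitivity problem can be sketched as follows. This is a simplified stand-in for the question's comparator, not the original code: the lambdas `parse` and `zero_on_unparsable` and the sample strings are hypothetical names and values made up for the demonstration.

```ruby
# Hypothetical helper: returns [major, minor, fraction] as integers, or nil.
parse = lambda do |text|
  match = /'(\d+):(\d+)\.(\d+)/.match(text)
  match ? match.captures.map(&:to_i) : nil
end

# Simplified comparator in the spirit of the question's code:
# descending order, but 0 whenever either side cannot be parsed.
zero_on_unparsable = lambda do |left, right|
  left_key  = parse.call(left)
  right_key = parse.call(right)
  return 0 if left_key.nil? || right_key.nil?
  right_key <=> left_key
end

a = "'10:00.1"
b = "frog"
c = "'9:59.9"

a_vs_b = zero_on_unparsable.call(a, b)  # 0: treated as equal
b_vs_c = zero_on_unparsable.call(b, c)  # 0: treated as equal
a_vs_c = zero_on_unparsable.call(a, c)  # -1: yet a and c are not equal

# A key-based sort avoids the inconsistency: every element gets one key,
# and unparsable elements share the empty key.
consistent = [c, b, a].sort_by { |text| parse.call(text) || [] }.reverse

puts [a_vs_b, b_vs_c, a_vs_c].inspect  # [0, 0, -1]
puts consistent.inspect                # ["'10:00.1", "'9:59.9", "frog"]
```

With the comparator, where "frog" ends up depends on which pairs the sort happens to compare. With the key, it always ends up last in descending order.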

#sort_by sorts in ascending order, so the call to #reverse changes it into descending order. #sort_by also has the advantage of parsing out the comparison values only once per element, whereas your algorithm has to parse each line again for every comparison.
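Putting it together, here is a minimal sketch of how one key could serve both event types: ascending for running events, reversed for field events. The Struct below is only a stand-in for the question's Participant class, and the names and results are invented. The counter is there just to show that the key block runs once per element.

```ruby
# Stand-in for the real Participant class (assumption for this sketch).
participant_class = Struct.new(:name, :time_distance)

parse_calls = 0

# Shared sort key: [major, minor, fraction] as integers, or [] if unparsable.
sort_key = lambda do |participant|
  parse_calls += 1
  match = /'(\d+):(\d+)\.(\d+)/.match(participant.time_distance.to_s)
  match ? match.captures.map(&:to_i) : []
end

entrants = [
  participant_class.new("Ann", "'10:03.00"),
  participant_class.new("Bob", "'9:11.50"),
  participant_class.new("Cy",  "'10:02.75")
]

# Running events: smallest time first.
running_order = entrants.sort_by(&sort_key).map(&:name)
calls_for_one_sort = parse_calls

# Field events: largest distance first.
field_order = entrants.sort_by(&sort_key).reverse.map(&:name)

puts running_order.inspect  # ["Bob", "Cy", "Ann"]
puts field_order.inspect    # ["Ann", "Cy", "Bob"]
puts calls_for_one_sort     # 3
```

One caveat with this approach: in the ascending (running) case an unparsable entry gets the empty key and would sort first, so it may be worth filtering such entries out beforehand.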

rampion answered Oct 11 '22 17:10
