
How to sort an array of accented words in Ruby

Tags: sorting, ruby

How can I sort an array of accented words by every letter, using the order defined in the variable alpha? The code below only consults alpha for the first letter, so "ĝusti", "ĝusti vin" and "ĝuspa" do not sort correctly.

I need the code to sort the words like this:

["bonan matenon", "ĉu vi parolas esperanton","ĝuspa", "ĝusti", "ĝusti vin",  "mi amas vin", "pacon"]

def alphabetize(phrases)
  alpha = "abcĉdefgĝhĥijĵklmnoprsŝtuŭvz".split(//)

  phrases.sort_by { |phrase|
    alpha.index(phrase[0])
  }
end

alphabetize(["mi amas vin", "bonan matenon", "pacon", "ĉu vi parolas esperanton", "ĝusti", "ĝusti vin","ĝuspa"])
asked Feb 06 '15 15:02 by hannaminx


2 Answers

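You could use the i18n gem to transliterate each phrase to ASCII and sort by the result. This compares ĉ as c and ĝ as g, so it approximates the Esperanto alphabet rather than following alpha exactly: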

# encoding: UTF-8
require 'i18n'
I18n.enforce_available_locales = false

a = ["bonan matenon", "ĉu vi parolas esperanton","ĝuspa", "ĝusti", "ĝusti vin",  "mi amas vin", "pacon"]
b = a.sort_by { |e| I18n.transliterate e }
puts b

This prints:

bonan matenon
ĉu vi parolas esperanton
ĝuspa
ĝusti
ĝusti vin
mi amas vin
pacon
answered Oct 05 '22 05:10 by peter


The fix is to map every character to its index in alpha instead of returning only the first character's index. sort_by then compares the resulting arrays of indices element by element:

def alphabetize(phrases)
  alpha = "abcĉdefgĝhĥijĵklmnoprsŝtuŭvz".chars

  phrases.sort_by do |phrase|
    phrase.chars.map { |c| alpha.index(c) }
  end
end

puts alphabetize(["mi amas vin", "bonan matenon", "pacon", "ĉu vi parolas esperanton", "ĝusti", "ĝusti vin","ĝuspa"])

Output:

bonan matenon
ĉu vi parolas esperanton
ĝuspa
ĝusti
ĝusti vin
mi amas vin
pacon
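
To see why sorting by an array of indices works, it may help to look at the keys directly. The following is only an illustrative sketch: sort_key and ALPHA are hypothetical names introduced here, not part of the answer's code. It relies on Array#<=>, which compares two arrays position by position and, if one array is a prefix of the other, puts the shorter one first.

```ruby
ALPHA = "abcĉdefgĝhĥijĵklmnoprsŝtuŭvz".chars

# Hypothetical helper: builds the same key the sort_by block produces.
def sort_key(phrase)
  phrase.chars.map { |c| ALPHA.index(c) }
end

p sort_key("ĝuspa")  # => [8, 24, 21, 19, 0]
p sort_key("ĝusti")  # => [8, 24, 21, 23, 11]

# The first difference is at position 3 (19 for "p" versus 23 for "t").
p sort_key("ĝuspa") <=> sort_key("ĝusti")      # => -1

# "ĝusti" is a prefix of "ĝusti vin", so the shorter key sorts first.
p sort_key("ĝusti") <=> sort_key("ĝusti vin")  # => -1
```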

To speed up index lookup, you could use a hash:

alpha = "abcĉdefgĝhĥijĵklmnoprsŝtuŭvz".each_char.with_index.to_h
#=> {"a"=>0, "b"=>1, "c"=>2, ..., "v"=>26, "z"=>27}

and call alpha[c] instead of alpha.index(c).
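
Put together, the hash-based version might look like the sketch below. It also guards against one edge case: a character that is not in the alphabet string (a space, for example) maps to nil, and comparing nil with an integer makes sort_by raise an ArgumentError when two phrases first differ at such a position. The constant name ALPHA_RANK and the fallback rank of -1 are assumptions made for this example; -1 simply makes unlisted characters sort before every letter.

```ruby
# Letter => rank lookup, built once instead of on every call.
ALPHA_RANK = "abcĉdefgĝhĥijĵklmnoprsŝtuŭvz".each_char.with_index.to_h

def alphabetize(phrases)
  phrases.sort_by do |phrase|
    # Assumption: characters outside the alphabet (spaces, punctuation)
    # get rank -1, so the key never contains nil.
    phrase.chars.map { |c| ALPHA_RANK.fetch(c, -1) }
  end
end

puts alphabetize(["mia", "mi amas vin", "ĝusti vin", "ĝusti", "ĝuspa"])
```

Here "mi amas vin" and "mia" first differ where one has a space and the other has "a", which is the case that fails with a plain alpha[c] or alpha.index(c) lookup. Uppercase letters would also fall back to -1, so input may need to be downcased first if that matters.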

answered Oct 05 '22 05:10 by Stefan