 

postgres unaccent function vs RoR transliterate

In our RoR project we use the Postgres unaccent function to compute an unaccented version of the name attribute of one of our models. The name can contain accented characters from various languages. We then save the result as the unaccent_name attribute. I don't like this solution because we have to make sure the Postgres unaccent extension is installed and accessible everywhere (when testing, moving or cleaning the database, and so on).

In RoR there is the ActiveSupport::Inflector.transliterate method, which should do something very similar.

I've found that it mostly converts accented characters the same way, but there are some differences:

same result:

SELECT unaccent('ľščťžý') AS unaccent_name;
=> "lsctzy"
ActiveSupport::Inflector.transliterate('ľščťžý')
=> "lsctzy"

different result:

SELECT unaccent('ß') AS unaccent_name;
=> "S"
ActiveSupport::Inflector.transliterate('ß')
=> "ss"

I know both of these methods can accept dictionaries with custom letter replacements, but I'm only interested in their general/default usage.

Is the main purpose of the transliterate method the same as that of the Postgres unaccent function? Can we use it as a replacement?

Rubow asked Oct 30 '22 20:10



1 Answer

This is a very old post, but I am working through a problem similar to the OP's. We want to search by name and use transliteration to get better results. However, with our versions of Postgres and Rails, the character 'ß' comes out the same from both, as 'ss'.

Just wanted to share my findings in case it may be useful to others who stumble across this post.

In Rails 5.2:

irb(main):001:0> ActiveSupport::Inflector.transliterate('ß')
=> "ss"

In Postgres 9.6 I get:

db-test=# SELECT unaccent('ß') AS unaccent_name;
 unaccent_name 
---------------
 ss
(1 row)
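
For anyone who wants to see why 'ß' is the odd one out, here is a minimal plain-Ruby sketch. It is not how unaccent or ActiveSupport work internally, just an illustration: characters like 'ľ' or 'š' decompose into a base letter plus a combining mark, so stripping the marks is enough, while 'ß' has no such decomposition and its result depends entirely on a replacement table. The FALLBACKS table and the strip_accents name are made up for this example.

```ruby
# frozen_string_literal: true

# Hypothetical fallback table for characters that have no combining-mark
# decomposition. Adjust it to match what your database returns.
FALLBACKS = { 'ß' => 'ss', 'ø' => 'o' }.freeze

# Decompose to NFD, drop the combining marks (\p{Mn}), then apply the table.
def strip_accents(text)
  stripped = text.unicode_normalize(:nfd).gsub(/\p{Mn}/, '')
  stripped.gsub(Regexp.union(FALLBACKS.keys), FALLBACKS)
end

puts strip_accents('ľščťžý') # prints "lsctzy"
puts strip_accents('ß')      # prints "ss" (only because of FALLBACKS)
```

Without the FALLBACKS entry, 'ß' would pass through unchanged, which is why the output for that character can differ between tools and versions.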
blindMoe answered Nov 15 '22 05:11
