Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Postgres accent insensitive LIKE search in Rails 3.1 on Heroku

How can I modify a where/like condition on a search query in Rails:

find(:all, :conditions => ["lower(name) LIKE ?", "%#{search.downcase}%"])

so that the results are matched irrespective of accents? (eg métro = metro). Because I'm using utf8, I can't use "to_ascii". Production is running on Heroku.

like image 207
user1051849 Avatar asked Feb 11 '12 19:02

user1051849


3 Answers

Proper solution

Since PostgreSQL 9.1 you can just:

CREATE EXTENSION unaccent;

Provides a function unaccent(), doing what you need (except for lower(), just use that additionally if needed). Read the manual about this extension.

More about unaccent and indexes:

  • Does PostgreSQL support "accent insensitive" collations?

Poor man's solution

If you can't install unacccent, but are able to create a function. I compiled the list starting here and added to it over time. It is comprehensive, but hardly complete:

CREATE OR REPLACE FUNCTION lower_unaccent(text)
  RETURNS text
  LANGUAGE sql IMMUTABLE STRICT AS
$func$
SELECT lower(translate($1
     , '¹²³áàâãäåāăąÀÁÂÃÄÅĀĂĄÆćčç©ĆČÇĐÐèéêёëēĕėęěÈÊËЁĒĔĖĘĚ€ğĞıìíîïìĩīĭÌÍÎÏЇÌĨĪĬłŁńňñŃŇÑòóôõöōŏőøÒÓÔÕÖŌŎŐØŒř®ŘšşșߊŞȘùúûüũūŭůÙÚÛÜŨŪŬŮýÿÝŸžżźŽŻŹ'
     , '123aaaaaaaaaaaaaaaaaaacccccccddeeeeeeeeeeeeeeeeeeeeggiiiiiiiiiiiiiiiiiillnnnnnnooooooooooooooooooorrrsssssssuuuuuuuuuuuuuuuuyyyyzzzzzz'
     ));
$func$;

Your query should work like that:

find(:all, :conditions => ["lower_unaccent(name) LIKE ?", "%#{search.downcase}%"])

For left-anchored searches, you can use an index on the function for very fast results:

CREATE INDEX tbl_name_lower_unaccent_idx
  ON fest (lower_unaccent(name) text_pattern_ops);

For queries like:

SELECT * FROM tbl WHERE (lower_unaccent(name)) LIKE 'bob%';

Or use COLLATE "C". See:

  • PostgreSQL LIKE query performance variations
  • Is there a difference between text_pattern_ops and COLLATE "C"?
like image 100
Erwin Brandstetter Avatar answered Sep 16 '22 18:09

Erwin Brandstetter


For those like me who are having trouble on add the unaccent extension for PostgreSQL and get it working with the Rails application, here is the migration you need to create:

class AddUnaccentExtension < ActiveRecord::Migration
  def up
    execute "create extension unaccent"
  end

  def down
    execute "drop extension unaccent"
  end
end

And, of course, after rake db:migrate you will be able to use the unaccent function in your queries: unaccent(column) similar to ... or unaccent(lower(column)) ...

like image 41
Edison Machado Avatar answered Sep 17 '22 18:09

Edison Machado


First of all, you install postgresql-contrib. Then you connect to your DB and execute:

CREATE EXTENSION unaccent;

to enable the extension for your DB.

Depending on your language, you might need to create a new rule file (in my case greek.rules, located in /usr/share/postgresql/9.1/tsearch_data), or just append to the existing unaccent.rules (quite straightforward).

In case you create your own .rules file, you need to make it default:

ALTER TEXT SEARCH DICTIONARY unaccent (RULES='greek');

This change is persistent, so you need not redo it.

The next step would be to add a method to a model to make use of this function.

One simple solution would be defining a function in the model. For instance:

class Model < ActiveRecord::Base
    [...]
    def self.unaccent(column,value)
        a=self.where('unaccent(?) LIKE ?', column, "%value%")
        a
    end
    [...]
end

Then, I can simply invoke:

Model.unaccent("name","text")

Invoking the same command without the model definition would be as plain as:

Model.where('unaccent(name) LIKE ?', "%text%"

Note: The above example has been tested and works for postgres9.1, Rails 4.0, Ruby 2.0.

UPDATE INFO
Fixed potential SQLi backdoor thanks to @Henrik N's feedback

like image 41
Ruby Racer Avatar answered Sep 17 '22 18:09

Ruby Racer