
How to have an accent-insensitive filter in Django with Postgres?

Hi, I found (in old mailing-list exchanges) that on a Postgres database we can't configure the default accent sensitivity.

Is there a way to make an __icontains lookup also insensitive to special characters (é, è, à, ç, ï), or must I use a Postgres regex to replace both sides with __iregex (ç->c, é->e ...)?

Edit: this question is old and is kept for users of Django versions before 1.8. For those using recent Django versions, here is the new way: https://docs.djangoproject.com/en/dev/ref/contrib/postgres/lookups/#std:fieldlookup-unaccent

christophe31 asked Apr 11 '11 10:04


2 Answers

EDIT: Django 1.8 has accent-insensitive lookups for PostgreSQL built in: https://docs.djangoproject.com/en/dev/ref/contrib/postgres/lookups/#std:fieldlookup-unaccent
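As a rough sketch of what that built-in lookup looks like in use: the helper name `search_by_name` is hypothetical, and the field name `name` is assumed from the raw SQL example further down. It also assumes `django.contrib.postgres` is in `INSTALLED_APPS` and the unaccent extension is installed in the database.

```python
def search_by_name(queryset, text):
    """Filter a queryset on `name`, ignoring both case and accents."""
    # `unaccent` is the lookup provided by django.contrib.postgres (Django 1.8+).
    # Chained with `icontains`, it unaccents both the column and the search value.
    return queryset.filter(name__unaccent__icontains=text)
```

It would be called as `search_by_name(MyObject.objects.all(), "Helene")`, which should then also match a stored "Hélène".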

In fact, Postgres contrib (8.4+) has an unaccent function that makes this kind of search easy:

for postgres 9/8.5:

  • https://github.com/adunstan/postgresql-dev/commits/master/contrib/unaccent
  • http://www.sai.msu.su/~megera/wiki/unaccent

for postgres 8.4:

  • https://launchpad.net/postgresql-unaccent

Here is an example of usage from Django:

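# Pass the search text as a query parameter so the driver escapes it,
# instead of concatenating it into the SQL string (SQL injection risk).
vals = MyObject.objects.raw(
        "SELECT * \
         FROM myapp_myobject \
         WHERE unaccent(name) LIKE %s",
        ['%' + search_text + '%'])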

You may also apply unaccent to the search text before the comparison.
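One way to do that on the Python side, as a minimal sketch: the helper `strip_accents` is hypothetical and uses Unicode decomposition from the standard library. It only approximates PostgreSQL's unaccent, which works from its own rules file, so results may differ for characters that have no canonical decomposition.

```python
import unicodedata

def strip_accents(text):
    """Return text with combining accent marks removed (e.g. 'é' -> 'e')."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks and keep everything else unchanged.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

search_text = strip_accents("Élève français")
```

The cleaned `search_text` can then be passed as the parameter of the query above.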

The option I chose is:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# partial credit goes to clarisys.fr
from django.db.backends.postgresql_psycopg2.base import *

class DatabaseOperations(DatabaseOperations):
    def lookup_cast(self, lookup_type):
        if lookup_type in('icontains', 'istartswith'):
            return "UPPER(unaccent(%s::text))"
        else:
            return super(DatabaseOperations, self).lookup_cast(lookup_type)

class DatabaseWrapper(DatabaseWrapper):
    def __init__(self, *args, **kwargs):
        super(DatabaseWrapper, self).__init__(*args, **kwargs)
        self.operators['icontains'] = 'LIKE UPPER(unaccent(%s))'
        self.operators['istartswith'] = 'LIKE UPPER(unaccent(%s))'
        self.ops = DatabaseOperations(self)

Save this file as base.py in a folder (a Python package) and use that folder as the database backend. icontains and istartswith are now case- and accent-insensitive.
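Pointing Django at that folder might look like the settings fragment below. This is only a sketch: the dotted path `myproject.db_backends.unaccent_pg` and the database name are placeholders, not names from the original setup, and the other answer reports that some Django versions refuse engine names outside django.db.backends.

```python
# settings.py (fragment). "myproject.db_backends.unaccent_pg" is a hypothetical
# dotted path to the package (a folder with __init__.py) that contains base.py.
DATABASES = {
    "default": {
        "ENGINE": "myproject.db_backends.unaccent_pg",
        "NAME": "mydb",  # placeholder database name
    }
}
```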

christophe31 answered Sep 22 '22 13:09


I managed to install unaccent from PostgreSQL contrib, but the answer above, which patches Django, didn't work for me: load_backend in django.db.utils enforces that the backend name starts with django.db.backends.

The solution that worked for me was inserting this code in one of my modules:

from django.db.backends.postgresql_psycopg2.base import DatabaseOperations, DatabaseWrapper

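# Keep the stock PostgreSQL implementation so that other lookups still get
# their usual casts; super() here would skip it and call the base class.
_original_lookup_cast = DatabaseOperations.lookup_cast

def lookup_cast(self, lookup_type, internal_type=None):
    if lookup_type in ('icontains', 'istartswith'):
        return "UPPER(unaccent(%s::text))"
    else:
        return _original_lookup_cast(self, lookup_type, internal_type)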

def patch_unaccent():
    DatabaseOperations.lookup_cast = lookup_cast
    DatabaseWrapper.operators['icontains'] = 'LIKE UPPER(unaccent(%s))'
    DatabaseWrapper.operators['istartswith'] = 'LIKE UPPER(unaccent(%s))'
    print('Unaccent patch')

patch_unaccent()

Now unaccent searches are working fine, even inside django admin! Thanks for your answer above!

bbrik answered Sep 25 '22 13:09