My users can save data as either "café" or "cafe", so I need to search those fields with an accent-insensitive query. I found https://github.com/djcoin/django-unaccent/, but I don't know whether something similar can be implemented in SQLAlchemy. I'm using PostgreSQL, so a solution specific to that database is fine, though a generic solution would be much better. Thanks for your help.


SQLALCHEMY ignore accents on query

2 Answers

First install the unaccent extension in PostgreSQL: create extension unaccent;

Next, declare the SQL function unaccent in Python:

from sqlalchemy.sql.functions import ReturnTypeFromArgs

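class unaccent(ReturnTypeFromArgs):
    # Renders as the SQL function unaccent(...); the return type follows the argument type.
    # SQLAlchemy 1.4+ expects inherit_cache on custom constructs to allow statement caching.
    inherit_cache = True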

and use it like this:

for place in session.query(Place).filter(unaccent(Place.name) == "cafe").all():
    print(place.name)

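Make sure you have a matching index if the table is large, otherwise this will result in a full table scan. A plain index on name is not used for unaccent(name), so the index has to be an expression index over the unaccented value. Note that PostgreSQL does not mark unaccent as IMMUTABLE, so it cannot be used directly in an index expression; the usual workaround is to wrap it in your own IMMUTABLE SQL function and index that.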
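As a rough sketch of the same idea without a custom class, the generic func namespace can emit unaccent(...) on both the column and the search term, so an accented term such as "café" matches too. The Place model here is a hypothetical stand-in for the one used above, SQLAlchemy 1.4 or later is assumed, and the statement is only compiled to SQL rather than executed:

```python
from sqlalchemy import Column, Integer, String, func, select
from sqlalchemy.dialects import postgresql
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Place(Base):
    # Hypothetical model, standing in for the Place class of the answer.
    __tablename__ = "place"
    id = Column(Integer, primary_key=True)
    name = Column(String(128))


def accent_insensitive_query(term):
    # Apply unaccent() to both sides so "café" and "cafe" compare equal.
    return select(Place).where(func.unaccent(Place.name) == func.unaccent(term))


# Compile against the PostgreSQL dialect to inspect the generated SQL;
# no database connection is needed for this step.
compiled = accent_insensitive_query("café").compile(dialect=postgresql.dialect())
sql = str(compiled)
print(sql)
```

With a real session, the statement would be run with something like session.execute(accent_insensitive_query("café")).scalars().all(), provided the unaccent extension is installed in the database.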

answered Oct 01 '22 21:10

chlunde

A simple and database agnostic solution is to write the field(s) that can have accents twice, once with and once without accents. Then you can conduct your searches on the unaccented version.

To generate the unaccented version of a string you can use Unidecode.
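If adding a dependency is not an option, a minimal standard-library sketch is shown here. It is not Unidecode: the strip_accents helper is a name made up for this example, and it only removes combining marks after Unicode decomposition, so letters without a decomposition (such as "ø") are left unchanged.

```python
import unicodedata


def strip_accents(text):
    # Hypothetical helper: decompose characters (NFKD), then drop combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("café"))   # cafe
print(strip_accents("Ñandú"))  # Nandu
```

Unidecode goes further and transliterates to plain ASCII, which is usually what you want for search keys covering many languages.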

To automatically assign the unaccented version to the database when a record is inserted or updated you can use the default and onupdate clauses in the Column definition. For example, using Flask-SQLAlchemy you could do something like this:

from unidecode import unidecode

# db is the Flask-SQLAlchemy instance, e.g. db = SQLAlchemy(app).

def unaccent(context):
    # Column default callable: derive the value from the parameters of the current statement.
    return unidecode(context.current_parameters['some_string'])

class MyModel(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    some_string = db.Column(db.String(128))
    some_string_unaccented = db.Column(db.String(128), default=unaccent, onupdate=unaccent, index=True)

Note how I only indexed the unaccented field, because that is the one on which the searches will be made.

Of course before you can search you also have to unaccent the value you are searching for. For example:

def search(text):
    # Call unidecode directly: unaccent() above expects an execution context, not a string.
    return MyModel.query.filter_by(some_string_unaccented=unidecode(text)).all()

You can apply the same technique to full text search, if necessary.
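To see the whole round trip in one place (store both versions, index the unaccented one, normalize the search term), here is a self-contained sketch. It deliberately swaps in the standard library's sqlite3 and unicodedata modules for Flask-SQLAlchemy and Unidecode so it runs without any setup; the table, column, and function names are made up for the example.

```python
import sqlite3
import unicodedata


def strip_accents(text):
    # Stand-in for unidecode(): drop combining marks after NFKD decomposition.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE place (id INTEGER PRIMARY KEY, name TEXT, name_unaccented TEXT)"
)
# Only the unaccented column is indexed, because searches run against it.
conn.execute("CREATE INDEX ix_place_name_unaccented ON place (name_unaccented)")


def add_place(name):
    # Write the value twice: once as entered, once without accents.
    conn.execute(
        "INSERT INTO place (name, name_unaccented) VALUES (?, ?)",
        (name, strip_accents(name)),
    )


def search(term):
    # Normalize the search term the same way as the stored column.
    rows = conn.execute(
        "SELECT name FROM place WHERE name_unaccented = ? ORDER BY id",
        (strip_accents(term),),
    )
    return [row[0] for row in rows]


for name in ("café", "cafe", "bar"):
    add_place(name)

print(search("café"))  # both spellings are returned
```

The key design point is that the same normalization function is applied when writing and when searching; if the two ever diverge, rows silently stop matching.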

answered Oct 01 '22 21:10

Miguel


Tags: python, flask, diacritics, sqlalchemy

guinunez
