Proper use of MySQL full text search with SQLAlchemy

I would like to be able to full text search across several text fields of one of my SQLAlchemy mapped objects. I would also like my mapped object to support foreign keys and transactions.

I plan to use MySQL to run the full text search. However, I understand that MySQL can only run full text search on a MyISAM table, which does not support transactions and foreign keys.

In order to accomplish my objective I plan to create two tables. My code will look something like this:

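from sqlalchemy import Column, Integer, String, Table, Text
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    description = Column(Text)

# MyISAM copy of the searchable columns
users_myisam = Table('users_myisam', Base.metadata,
                     Column('id', Integer),
                     Column('name', String(50)),
                     Column('description', Text),
                     mysql_engine='MyISAM')

# assumes Base.metadata has already been bound to a MySQL engine
conn = Base.metadata.bind.connect()
conn.execute("CREATE FULLTEXT INDEX idx_users_ftxt "
             "ON users_myisam (name, description)")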

Then, to search I will run this:

q = 'monkey'
ft_search = users_myisam.select("MATCH (name,description) AGAINST ('%s')" % q)
result = ft_search.execute()
for row in result: print row

This seems to work, but I have a few questions:

1. Is my approach of creating two tables to solve my problem reasonable? Is there a standard/better/cleaner way to do this?
2. Is there a SQLAlchemy way to create the fulltext index, or am I best to just directly execute "CREATE FULLTEXT INDEX ..." as I did above?
3. Looks like I have a SQL injection problem in my search/match against query. How can I do the select the "SQLAlchemy way" to fix this?
4. Is there a clean way to join the users_myisam select/match against right back to my user table and return actual User instances, since this is what I really want?
5. In order to keep my users_myisam table in sync with my mapped object user table, does it make sense for me to use a MapperExtension on my User class, and set the before_insert, before_update, and before_delete methods to update the users_myisam table appropriately, or is there some better way to accomplish this?

Thanks, Michael

510

asked Feb 20 '13 03:02

mickeybob

1 Answer

Is my approach of creating two tables to solve my problem reasonable? Is there a standard/better/cleaner way to do this?

I've not seen this use case attempted before, as developers who value transactions and constraints tend to use PostgreSQL in the first place. I understand that may not be possible in your specific scenario.

Is there a SQLAlchemy way to create the fulltext index, or am I best to just directly execute "CREATE FULLTEXT INDEX ..." as I did above?

conn.execute() is fine, though if you want something slightly more integrated you can use the DDL() construct. Read through http://docs.sqlalchemy.org/en/rel_0_8/core/schema.html?highlight=ddl#customizing-ddl for details.
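A minimal sketch of two ways to do this, assuming the users_myisam table from the question. The first uses the MySQL dialect's mysql_prefix option on Index, so the index is emitted by metadata.create_all(). The second wraps the raw statement in DDL() and attaches it to the table's after_create event. The helper name attach_fulltext_ddl is made up for illustration, and you would use one route or the other, not both.

```python
from sqlalchemy import (
    DDL, Column, Index, Integer, MetaData, String, Table, Text, event,
)
from sqlalchemy.dialects import mysql
from sqlalchemy.schema import CreateIndex

metadata = MetaData()

users_myisam = Table(
    "users_myisam", metadata,
    Column("id", Integer),
    Column("name", String(50)),
    Column("description", Text),
    mysql_engine="MyISAM",
)

# Route 1: a regular Index with the MySQL-specific FULLTEXT prefix.
# It belongs to the table, so metadata.create_all() creates it too.
ft_index = Index(
    "idx_users_ftxt",
    users_myisam.c.name,
    users_myisam.c.description,
    mysql_prefix="FULLTEXT",
)


def attach_fulltext_ddl(table):
    """Route 2 (hypothetical helper): run the raw statement after CREATE TABLE."""
    ddl = DDL(
        "CREATE FULLTEXT INDEX idx_users_ftxt "
        "ON users_myisam (name, description)"
    )
    # execute_if() limits the statement to MySQL connections
    event.listen(table, "after_create", ddl.execute_if(dialect="mysql"))


# Render the DDL for route 1 without connecting to a database.
ddl_sql = str(CreateIndex(ft_index).compile(dialect=mysql.dialect()))
print(ddl_sql)
```

Compiling against mysql.dialect() only renders the statement, so this can be checked without a running server.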

Looks like I have a SQL injection problem in my search/match against query. How can I do the select the "SQLAlchemy way" to fix this?

Note: this recipe is only for MATCH against multiple columns simultaneously. If you have just one column, the match() operator is simpler.
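For the single-column case, a sketch along these lines should do, assuming the same users_myisam table. The search string is passed as a bound parameter, and on the MySQL dialect the operator renders a MATCH ... AGAINST clause.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, Text
from sqlalchemy.dialects import mysql

metadata = MetaData()

users_myisam = Table(
    "users_myisam", metadata,
    Column("id", Integer),
    Column("name", String(50)),
    Column("description", Text),
    mysql_engine="MyISAM",
)

# match() is a generic column operator; each dialect renders it differently
stmt = users_myisam.select().where(users_myisam.c.description.match("monkey"))

# Render for MySQL without connecting, just to inspect the SQL.
sql = str(stmt.compile(dialect=mysql.dialect()))
print(sql)
```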

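Most basically, you could use the text() construct:

from sqlalchemy import text

# q is the search string from the question; it is sent as a bound parameter
# (the bindparams() method needs SQLAlchemy 0.9 or later)
users_myisam.select().where(
    text("MATCH (name, description) AGAINST (:value)").bindparams(value=q)
)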

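More comprehensively, you could define a custom construct:

from sqlalchemy import literal
from sqlalchemy.ext.compiler import compiles
from sqlalchemy.sql.expression import ColumnElement

# ColumnElement, so that newer releases accept it inside a WHERE clause
class Match(ColumnElement):
    # no cache key is defined for this construct, so opt out of
    # statement caching (the flag is only consulted by SQLAlchemy 1.4+)
    inherit_cache = False

    def __init__(self, columns, value):
        self.columns = columns
        # literal() wraps the value in a bind parameter
        self.value = literal(value)

@compiles(Match)
def _match(element, compiler, **kw):
    return "MATCH (%s) AGAINST (%s)" % (
        ", ".join(compiler.process(c, **kw) for c in element.columns),
        compiler.process(element.value, **kw),
    )

my_table.select().where(Match([my_table.c.a, my_table.c.b], "some value"))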

docs:

http://docs.sqlalchemy.org/en/rel_0_8/core/compiler.html
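Newer SQLAlchemy releases (1.4.19 and later, as far as I'm aware) ship a multi-column construct in the MySQL dialect, sqlalchemy.dialects.mysql.match, which may make the custom construct unnecessary. A sketch, assuming that version and the users_myisam table from the question:

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, Text
from sqlalchemy.dialects import mysql
from sqlalchemy.dialects.mysql import match

metadata = MetaData()

users_myisam = Table(
    "users_myisam", metadata,
    Column("id", Integer),
    Column("name", String(50)),
    Column("description", Text),
    mysql_engine="MyISAM",
)

# The columns must be the ones covered by the FULLTEXT index.
match_expr = match(
    users_myisam.c.name,
    users_myisam.c.description,
    against="monkey",
)

stmt = users_myisam.select().where(match_expr)
bool_stmt = users_myisam.select().where(match_expr.in_boolean_mode())

# Render for MySQL without connecting; the search term stays a bound parameter.
sql = str(stmt.compile(dialect=mysql.dialect()))
bool_sql = str(bool_stmt.compile(dialect=mysql.dialect()))
print(sql)
print(bool_sql)
```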

Is there a clean way to join the users_myisam select/match against right back to my user table and return actual User instances, since this is what I really want?

You should probably create a UserSearch class mapped to users_myisam, just like User is mapped to users, then use relationship() to link the two classes together (as User.search_table here). Then simple operations like this are possible:

session.query(User).join(User.search_table).\
    filter(Match([UserSearch.name, UserSearch.description], "some value"))
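One way the mapping could look, as a sketch rather than a definitive layout. It assumes SQLAlchemy 1.4.19 or later (for the dialect's match construct) and uses foreign() in the primaryjoin because MyISAM does not enforce foreign keys. The names UserSearch and search_table follow the query above; everything else is illustrative.

```python
from sqlalchemy import Column, Integer, String, Text, select
from sqlalchemy.dialects import mysql
from sqlalchemy.dialects.mysql import match
from sqlalchemy.orm import declarative_base, foreign, relationship

Base = declarative_base()


class User(Base):
    __tablename__ = "users"
    __table_args__ = {"mysql_engine": "InnoDB"}

    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    description = Column(Text)

    # No real FOREIGN KEY exists, so foreign() marks the "child" column.
    search_table = relationship(
        "UserSearch",
        primaryjoin=lambda: User.id == foreign(UserSearch.id),
        uselist=False,
    )


class UserSearch(Base):
    __tablename__ = "users_myisam"
    __table_args__ = {"mysql_engine": "MyISAM"}

    # Same value as users.id; filled in from the parent row at flush time.
    id = Column(Integer, primary_key=True, autoincrement=False)
    name = Column(String(50))
    description = Column(Text)


search = UserSearch.__table__

# Select User entities, filtered by a full text match on the MyISAM copy.
stmt = (
    select(User)
    .join(User.search_table)
    .where(match(search.c.name, search.c.description, against="monkey"))
)

# Render for MySQL without connecting, just to inspect the SQL.
sql = str(stmt.compile(dialect=mysql.dialect()))
print(sql)
```

With a live session, session.execute(stmt).scalars().all() would return User instances.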

In order to keep my users_myisam table in sync with my mapped object user table, does it make sense for me to use a MapperExtension on my User class, and set the before_insert, before_update, and before_delete methods to update the users_myisam table appropriately, or is there some better way to accomplish this?

MapperExtension is deprecated, so you'd at least use the event API. In most cases, though, we want to apply object mutations outside of the flush process. Here I'd use the constructor of User, or alternatively the init event, together with a basic @validates decorator that receives the values set on the target attributes of User and copies them into User.search_table.
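A sketch of the @validates approach, assuming the User / UserSearch mapping described earlier (UserSearch mapped to users_myisam and linked as User.search_table). The validator name _copy_to_search is made up. The delete-orphan cascade is my own addition to cover the delete case, which @validates does not handle.

```python
from sqlalchemy import Column, Integer, String, Text
from sqlalchemy.orm import declarative_base, foreign, relationship, validates

Base = declarative_base()


class User(Base):
    __tablename__ = "users"
    __table_args__ = {"mysql_engine": "InnoDB"}

    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    description = Column(Text)

    search_table = relationship(
        "UserSearch",
        primaryjoin=lambda: User.id == foreign(UserSearch.id),
        uselist=False,
        # assumption: deleting a User should also delete its search row
        cascade="all, delete-orphan",
    )

    @validates("name", "description")
    def _copy_to_search(self, key, value):
        # Runs whenever name or description is set, including in __init__.
        if self.search_table is None:
            self.search_table = UserSearch()
        setattr(self.search_table, key, value)
        return value


class UserSearch(Base):
    __tablename__ = "users_myisam"
    __table_args__ = {"mysql_engine": "MyISAM"}

    id = Column(Integer, primary_key=True, autoincrement=False)
    name = Column(String(50))
    description = Column(Text)


# Both objects stay in step before anything is flushed.
user = User(name="Bob", description="likes monkeys")
user.description = "likes gibbons"
print(user.search_table.name, "/", user.search_table.description)
```

Since the copy happens on attribute set, the two rows are flushed together in one unit of work. Keep in mind that the MyISAM write itself is still not transactional.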

Overall, if you've been learning SQLAlchemy from another source (like the O'Reilly book), it's out of date by many years, and I'd focus on the current online documentation.

109

answered Sep 23 '22 03:09

zzzeek

Tags: python, mysql, full-text-search, sqlalchemy
