I would like to be able to full text search across several text fields of one of my SQLAlchemy mapped objects. I would also like my mapped object to support foreign keys and transactions.
I plan to use MySQL to run the full text search. However, I understand that MySQL can only run full text search on a MyISAM table, which does not support transactions and foreign keys.
In order to accomplish my objective I plan to create two tables. My code will look something like this:
    class User(Base):
        __tablename__ = 'users'
        id = Column(Integer, primary_key=True)
        name = Column(String(50))
        description = Column(Text)

    users_myisam = Table('users_myisam', Base.metadata,
                         Column('id', Integer),
                         Column('name', String(50)),
                         Column('description', Text),
                         mysql_engine='MyISAM')

    conn = Base.metadata.bind.connect()
    conn.execute("CREATE FULLTEXT INDEX idx_users_ftxt \
                  on users_myisam (name, description)")
Then, to search I will run this:
    q = 'monkey'
    ft_search = users_myisam.select("MATCH (name,description) AGAINST ('%s')" % q)
    result = ft_search.execute()
    for row in result:
        print(row)
This seems to work, but I have a few questions:
Is my approach of creating two tables to solve my problem reasonable? Is there a standard/better/cleaner way to do this?
Is there a SQLAlchemy way to create the fulltext index, or am I best to just directly execute "CREATE FULLTEXT INDEX ..." as I did above?
Looks like I have a SQL injection problem in my search/match against query. How can I do the select the "SQLAlchemy way" to fix this?
Is there a clean way to join the users_myisam select/match against right back to my user table and return actual User instances, since this is what I really want?
In order to keep my users_myisam table in sync with my mapped object user table, does it make sense for me to use a MapperExtension on my User class, and set the before_insert, before_update, and before_delete methods to update the users_myisam table appropriately, or is there some better way to accomplish this?
Thanks, Michael
Is my approach of creating two tables to solve my problem reasonable? Is there a standard/better/cleaner way to do this?
I've not seen this use case attempted before, as developers who value transactions and constraints tend to use PostgreSQL in the first place. I understand that may not be possible in your specific scenario.
Is there a SQLAlchemy way to create the fulltext index, or am I best to just directly execute "CREATE FULLTEXT INDEX ..." as I did above?
conn.execute() is fine, though if you want something slightly more integrated you can use the DDL() construct; read through http://docs.sqlalchemy.org/en/rel_0_8/core/schema.html?highlight=ddl#customizing-ddl for details.
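As a rough sketch of the DDL() approach, the statement can be attached to the table's after_create event so that it runs whenever create_all() builds the table. The standalone metadata object and the execute_if(dialect="mysql") guard are assumptions added for illustration, not part of the original setup.

```python
from sqlalchemy import DDL, Column, Integer, MetaData, String, Table, Text, event

metadata = MetaData()  # stand-in for Base.metadata in the question

users_myisam = Table(
    "users_myisam", metadata,
    Column("id", Integer),
    Column("name", String(50)),
    Column("description", Text),
    mysql_engine="MyISAM",
)

# Raw DDL that is only emitted when the target database is MySQL.
create_fulltext = DDL(
    "CREATE FULLTEXT INDEX idx_users_ftxt ON users_myisam (name, description)"
).execute_if(dialect="mysql")

# Run the statement right after CREATE TABLE users_myisam is emitted.
event.listen(users_myisam, "after_create", create_fulltext)
```

With this in place, metadata.create_all(engine) against a MySQL engine should create the table and then the index in one pass.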
Looks like I have a SQL injection problem in my search/match against query. How can I do the select the "SQLAlchemy way" to fix this?
note: this recipe is only for MATCH against multiple columns simultaneously. If you have just one column, the match() operator is simpler.
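For the single-column case, a minimal sketch using the match() operator might look like the following. The table definition mirrors the one in the question, and the statement is only compiled against the MySQL dialect here rather than executed, so no database connection is assumed.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, Text, select
from sqlalchemy.dialects import mysql

metadata = MetaData()
users_myisam = Table(
    "users_myisam", metadata,
    Column("id", Integer),
    Column("name", String(50)),
    Column("description", Text),
    mysql_engine="MyISAM",
)

# match() is a generic column operator; the MySQL dialect renders it
# as a MATCH ... AGAINST expression with the search term as a bound parameter.
stmt = select(users_myisam).where(users_myisam.c.name.match("monkey"))

compiled = stmt.compile(dialect=mysql.dialect())
sql = str(compiled)
print(sql)
print(compiled.params)
```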
most basically you could use the text() construct:
    from sqlalchemy import text, bindparam

    users_myisam.select(
        text("MATCH (name,description) AGAINST (:value)",
             bindparams=[bindparam('value', q)])
    )
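The snippet above is written against the 0.8 API. On newer SQLAlchemy releases (1.4 and later, to my understanding) the bindparams keyword of text() is gone and the usual spelling is the .bindparams() method together with select().where(). A sketch of that form, compiled rather than executed so it needs no database:

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, Text, select, text
from sqlalchemy.dialects import mysql

metadata = MetaData()
users_myisam = Table(
    "users_myisam", metadata,
    Column("id", Integer),
    Column("name", String(50)),
    Column("description", Text),
    mysql_engine="MyISAM",
)

# Hostile input: it must travel as a parameter, never as part of the SQL text.
q = "monkey'; DROP TABLE users; --"

match_clause = text(
    "MATCH (name, description) AGAINST (:value)"
).bindparams(value=q)

stmt = select(users_myisam).where(match_clause)

compiled = stmt.compile(dialect=mysql.dialect())
sql = str(compiled)
print(sql)              # the SQL string contains only a placeholder
print(compiled.params)  # the search term is carried separately
```

Because the value is sent to the driver as a parameter, quoting is handled by the DBAPI and the injection in the original string interpolation goes away.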
more comprehensively you could define a custom construct:
    from sqlalchemy.ext.compiler import compiles
    from sqlalchemy.sql.expression import ClauseElement
    from sqlalchemy import literal

    class Match(ClauseElement):
        def __init__(self, columns, value):
            self.columns = columns
            self.value = literal(value)

    @compiles(Match)
    def _match(element, compiler, **kw):
        return "MATCH (%s) AGAINST (%s)" % (
            ", ".join(compiler.process(c, **kw) for c in element.columns),
            compiler.process(element.value)
        )

    my_table.select(Match([my_table.c.a, my_table.c.b], "some value"))
docs:
http://docs.sqlalchemy.org/en/rel_0_8/core/compiler.html
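On more recent SQLAlchemy versions the custom construct may not be needed: the MySQL dialect ships its own multi-column match() construct (added in 1.4.19, as far as I know). A hedged sketch of that alternative, assuming such a version is installed and reusing the table layout from the question:

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, Text, select
from sqlalchemy.dialects import mysql
from sqlalchemy.dialects.mysql import match

metadata = MetaData()
users_myisam = Table(
    "users_myisam", metadata,
    Column("id", Integer),
    Column("name", String(50)),
    Column("description", Text),
    mysql_engine="MyISAM",
)

# Dialect-specific MATCH over several columns; the search term is bound.
match_expr = match(
    users_myisam.c.name,
    users_myisam.c.description,
    against="monkey",
)

stmt = select(users_myisam).where(match_expr)

compiled = stmt.compile(dialect=mysql.dialect())
sql = str(compiled)
print(sql)
```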
Is there a clean way to join the users_myisam select/match against right back to my user table and return actual User instances, since this is what I really want?
you should probably create a UserMyISAM class (called UserSearch in the snippet below), map it just like User, then use relationship() to link the two classes together (here as User.search_table). Simple operations like this are then possible:

    query(User).join(User.search_table).\
        filter(Match([UserSearch.x, UserSearch.y], "some value"))
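A sketch of how the two mapped classes and the join could be laid out. The ForeignKey on UserSearch.id is an assumption that gives relationship() a join condition (MyISAM does not enforce it), and a text() clause stands in for the Match construct to keep the example self-contained. It uses the 1.4+ select() style and is only compiled, not executed.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Text, select, text
from sqlalchemy.dialects import mysql
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    description = Column(Text)
    # One search row per user.
    search_table = relationship("UserSearch", uselist=False, back_populates="user")

class UserSearch(Base):
    __tablename__ = "users_myisam"
    __table_args__ = {"mysql_engine": "MyISAM"}
    # Shares its primary key with users.id (assumed layout).
    id = Column(Integer, ForeignKey("users.id"), primary_key=True)
    name = Column(String(50))
    description = Column(Text)
    user = relationship("User", back_populates="search_table")

match_clause = text(
    "MATCH (users_myisam.name, users_myisam.description) AGAINST (:value)"
).bindparams(value="monkey")

# Selects User entities, filtered through the MyISAM search table.
stmt = select(User).join(User.search_table).where(match_clause)
sql = str(stmt.compile(dialect=mysql.dialect()))
print(sql)
```

Executing the statement through a Session (for example session.scalars(stmt) on 1.4+) would then return User instances rather than raw rows.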
In order to keep my users_myisam table in sync with my mapped object user table, does it make sense for me to use a MapperExtension on my User class, and set the before_insert, before_update, and before_delete methods to update the users_myisam table appropriately, or is there some better way to accomplish this?
MapperExtensions are deprecated, so you'd at least use the event API, and in most cases we want to try applying object mutations outside of the flush process. In this case, I'd be using the constructor for User, or alternatively the init event, as well as a basic @validates decorator which will receive values for the target attributes on User and copy those values into User.search_table.
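A minimal sketch of the @validates idea, under the assumption that the search row is a one-to-one UserSearch object reachable as User.search_table. The class layout, the cascade setting and the lazy creation of the search row inside the validator are illustrative choices, not the only way to do it.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Text
from sqlalchemy.orm import declarative_base, relationship, validates

Base = declarative_base()

class UserSearch(Base):
    __tablename__ = "users_myisam"
    __table_args__ = {"mysql_engine": "MyISAM"}
    id = Column(Integer, ForeignKey("users.id"), primary_key=True)
    name = Column(String(50))
    description = Column(Text)

class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    description = Column(Text)
    # Deleting a User also deletes its search row.
    search_table = relationship(
        "UserSearch", uselist=False, cascade="all, delete-orphan"
    )

    @validates("name", "description")
    def _copy_to_search_row(self, key, value):
        # Called on every assignment to name/description, including the
        # keyword arguments passed to the default constructor.
        if self.search_table is None:
            self.search_table = UserSearch()
        setattr(self.search_table, key, value)
        return value

user = User(name="monkey", description="likes bananas")
user.name = "ape"  # the search row follows the change
```

Since the copy happens at attribute-set time, both rows are already in their final state when the flush runs, which avoids mutating objects inside flush hooks.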
Overall, if you've been learning SQLAlchemy from another source (like the O'Reilly book), it's really out of date by many years, and I'd focus on the current online documentation.