I am creating a simple app which can perform basic operations. SQLite is used as the database. I want to perform wildcard searches, but I know they have poor performance. I want to try out full-text search, but I cannot find an example of how to do it. I have confirmed that SQLite has full-text search support. Here is my sample code.
from flask_sqlalchemy import SQLAlchemy

db = SQLAlchemy()

class Person(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.Text, unique=True, nullable=False)
    thumb = db.Column(db.Text, nullable=False, default="")
    role = db.relationship("Role", backref="person", cascade="delete")

class Role(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    person_id = db.Column(db.Integer, db.ForeignKey(Person.id, ondelete="CASCADE"), nullable=False)
    role = db.Column(db.Text, nullable=False)
How can I create an FTS index and query it using SQLAlchemy? For example, searching name in Person.
FTS5 is an SQLite virtual table module that provides full-text search functionality to database applications. In their most elementary form, full-text search engines allow the user to efficiently search a large collection of documents for the subset that contain one or more instances of a search term.
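To get a standalone feel for FTS5 before bringing SQLAlchemy into it, here is a minimal sketch using Python's built-in sqlite3 module (assuming your SQLite build was compiled with the FTS5 extension, as most are):

import sqlite3

# Minimal FTS5 demo: create a virtual table, insert a document, query it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(body)")
conn.execute("INSERT INTO docs (body) VALUES ('efficient full text search')")
print(conn.execute(
    "SELECT body FROM docs WHERE docs MATCH 'search'").fetchall())
# [('efficient full text search',)]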
SQLAlchemy is an SQL toolkit that provides efficient and high-performing database access for relational databases. It provides ways to interact with several database engines such as SQLite, MySQL, and PostgreSQL.
In SQLAlchemy, an Index represents an ordinary database index on a table; it can be built over a single column or a combination of columns to speed up row lookups. FTS5, however, works through virtual tables rather than column indexes, as explained below.
FTS5 provides virtual tables that support full-text search. In other words you cannot create a full-text index on a column in an existing table. Instead you can create an FTS5 virtual table and copy relevant data from your original table over for indexing. In order to avoid storing the same data twice you can make it an external content table, though you will still have to make sure that the FTS5 table is kept in sync, either manually or with triggers.
You could create a generic custom DDL construct that handles creating an FTS5 virtual table mirroring another table:
from sqlalchemy import String
from sqlalchemy.ext.compiler import compiles
from sqlalchemy.schema import DDLElement

class CreateFtsTable(DDLElement):
    """Represents a CREATE VIRTUAL TABLE ... USING fts5 statement, for indexing
    a given table.
    """
    def __init__(self, table, version=5):
        self.table = table
        self.version = version

@compiles(CreateFtsTable)
def compile_create_fts_table(element, compiler, **kw):
    """Compile a CreateFtsTable element to an external content FTS virtual
    table DDL statement.
    """
    tbl = element.table
    version = element.version
    preparer = compiler.preparer
    sql_compiler = compiler.sql_compiler

    vtbl_name = preparer.quote(tbl.name + "_idx")

    text = "\nCREATE VIRTUAL TABLE "
    text += vtbl_name + " "
    text += "USING fts" + str(version) + "("

    separator = "\n"

    # Mirror every column except the primary key, which becomes the rowid.
    pk_column, = tbl.primary_key
    columns = [col for col in tbl.columns if col is not pk_column]

    for column in columns:
        text += separator
        separator = ", \n"
        text += "\t" + preparer.format_column(column)
        # Non-text columns are stored, but excluded from the full-text index.
        if not isinstance(column.type, String):
            text += " UNINDEXED"

    # Make it an external content table backed by the original table.
    text += separator
    text += "\tcontent=" + sql_compiler.render_literal_value(
        tbl.name, String())
    text += separator
    text += "\tcontent_rowid=" + sql_compiler.render_literal_value(
        pk_column.name, String())
    text += "\n)\n\n"
    return text
The given implementation is a bit naive and indexes all text columns by default. The created virtual table is implicitly named by adding _idx after the original table name.
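For example, emitting the construct for the Person model from the question could look like this (a sketch; Person.__table__ is the underlying Table object, and db.engine requires a Flask application context):

# Create the person_idx virtual table mirroring person.
with db.engine.begin() as conn:
    conn.execute(CreateFtsTable(Person.__table__))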
But that alone is not enough if you want to automate keeping the tables in sync with triggers. Since you are adding an index for just one table, you could opt to use textual DDL constructs in your migration script instead:
from alembic import op
import sqlalchemy as sa

def upgrade():
    ddl = [
        # External content FTS5 table indexing person.name.
        """
        CREATE VIRTUAL TABLE person_idx USING fts5(
            name,
            thumb UNINDEXED,
            content='person',
            content_rowid='id'
        )
        """,
        # Keep the index in sync on INSERT.
        """
        CREATE TRIGGER person_ai AFTER INSERT ON person BEGIN
            INSERT INTO person_idx (rowid, name, thumb)
            VALUES (new.id, new.name, new.thumb);
        END
        """,
        # Keep the index in sync on DELETE using the FTS5 'delete' command.
        """
        CREATE TRIGGER person_ad AFTER DELETE ON person BEGIN
            INSERT INTO person_idx (person_idx, rowid, name, thumb)
            VALUES ('delete', old.id, old.name, old.thumb);
        END
        """,
        # An UPDATE is a delete of the old row followed by an insert of the new.
        """
        CREATE TRIGGER person_au AFTER UPDATE ON person BEGIN
            INSERT INTO person_idx (person_idx, rowid, name, thumb)
            VALUES ('delete', old.id, old.name, old.thumb);
            INSERT INTO person_idx (rowid, name, thumb)
            VALUES (new.id, new.name, new.thumb);
        END
        """
    ]

    for stmt in ddl:
        op.execute(sa.DDL(stmt))
If your person table contains existing data, remember to insert those rows into the created virtual table as well for indexing.
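A one-off backfill in the same migration could be a sketch like this:

# Backfill pre-existing person rows into the index.
op.execute(
    "INSERT INTO person_idx (rowid, name, thumb) "
    "SELECT id, name, thumb FROM person"
)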
In order to actually use the created virtual table, you could create a non-primary mapper for Person:
person_idx = db.Table('person_idx', db.metadata,
                      db.Column('rowid', db.Integer(), primary_key=True),
                      db.Column('name', db.Text()),
                      db.Column('thumb', db.Text()))

PersonIdx = db.mapper(
    Person, person_idx, non_primary=True,
    properties={
        'id': person_idx.c.rowid
    }
)
And to make a full-text query using, for example, the MATCH operator:
db.session.query(PersonIdx).\
    filter(PersonIdx.c.name.op("MATCH")("john")).\
    all()
Note that the result is a list of Person objects. PersonIdx is just a Mapper.
As noted by Victor K., the use of non-primary mappers is deprecated and the new alternative is to use aliased(). The setup is mostly the same, but the rowid to id mapping needs to take place when creating the person_idx Table, using the key parameter of Column:
person_idx = db.Table('person_idx', db.metadata,
                      db.Column('rowid', db.Integer(), key='id', primary_key=True),
                      db.Column('name', db.Text()),
                      db.Column('thumb', db.Text()))
and instead of a new mapper create the alias:
PersonIdx = db.aliased(Person, person_idx, adapt_on_names=True)
The alias works more like the mapped class in that you do not access mapped attributes through .c, but directly:
db.session.query(PersonIdx).\
    filter(PersonIdx.name.op("MATCH")("john")).\
    all()
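If you also want matches ordered by relevance, one option (not part of the setup above, just a sketch) is a textual query against the virtual table, since FTS5 exposes a hidden rank column for MATCH queries:

# Rowids of matching people, best matches first, via FTS5's hidden
# "rank" column.
stmt = db.text(
    "SELECT rowid FROM person_idx WHERE person_idx MATCH :q ORDER BY rank"
)
ids = [row[0] for row in db.session.execute(stmt, {"q": "john"})]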