I want to run a large number of queries that filter by tag on a PostgreSQL table:
from sqlalchemy.dialects.postgresql import ARRAY

class Post(db.Model):
    __tablename__ = 'post'

    id = db.Column(db.Integer, primary_key=True)
    tags = db.Column(ARRAY(db.String))
This link recommends storing tags as a text array with a GIN index.
How do I add a GIN index to the table above? Also, does it make a difference whether I use the String or the Text datatype?
PostgreSQL supports sequences, and SQLAlchemy uses these as the default means of creating new primary key values for integer-based primary key columns.
GIN stands for Generalized Inverted Index. GIN is designed for handling cases where the items to be indexed are composite values, and the queries to be handled by the index need to search for element values that appear within the composite items.
psycopg2 is over 2x faster than SQLAlchemy on a small table. This is expected, since psycopg2 is a database driver for PostgreSQL while SQLAlchemy is a general-purpose ORM library.
A SQLAlchemy Index defines a database index on a table so that lookups on the indexed columns can be served faster. An index can cover a single column, or a combination of two or more columns that together act as the index for the table's rows.
I solved it as follows:
from sqlalchemy.dialects.postgresql import ARRAY, array

class Post(db.Model):
    __tablename__ = 'post'

    id = db.Column(db.Integer, primary_key=True)
    tags = db.Column(ARRAY(db.Text), nullable=False, default=db.cast(array([], type_=db.Text), ARRAY(db.Text)))
    __table_args__ = (db.Index('ix_post_tags', tags, postgresql_using="gin"), )
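To check what this index definition turns into without connecting to a database, the DDL can be compiled against the PostgreSQL dialect. This is only a minimal sketch: it assumes plain SQLAlchemy 1.4 or later rather than Flask-SQLAlchemy's `db` object, the `Base` class and variable names are illustrative, and the column default from the model above is left out for brevity.

```python
from sqlalchemy import Column, Index, Integer, Text
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import ARRAY
from sqlalchemy.orm import declarative_base
from sqlalchemy.schema import CreateIndex

# Illustrative stand-in for db.Model (assumption: plain SQLAlchemy, no Flask).
Base = declarative_base()


class Post(Base):
    __tablename__ = "post"

    id = Column(Integer, primary_key=True)
    tags = Column(ARRAY(Text), nullable=False)

    # postgresql_using="gin" asks the PostgreSQL dialect for a GIN index.
    __table_args__ = (Index("ix_post_tags", tags, postgresql_using="gin"),)


# The table has exactly one index, the GIN index declared above.
gin_index = next(iter(Post.__table__.indexes))

# Compile the CREATE INDEX statement for PostgreSQL; no connection is needed.
index_ddl = str(CreateIndex(gin_index).compile(dialect=postgresql.dialect()))
print(index_ddl)
```

The printed statement should contain a `USING gin` clause on the `tags` column, which is what tells PostgreSQL to build a GIN index instead of the default B-tree.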
Then query simply with:
db.session.query(Post).filter(Post.tags.contains([tag]))
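For a GIN index to help, the filter has to use an operator that the index supports, such as the array containment operator `@>`. The sketch below compiles an equivalent query to SQL to show which operator `contains` produces. It assumes plain SQLAlchemy 1.4 or later instead of Flask-SQLAlchemy; the `Base` class and the tag value `"python"` are made up for illustration.

```python
from sqlalchemy import Column, Integer, Text, select
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import ARRAY
from sqlalchemy.orm import declarative_base

# Illustrative stand-in for db.Model (assumption: plain SQLAlchemy, no Flask).
Base = declarative_base()


class Post(Base):
    __tablename__ = "post"

    id = Column(Integer, primary_key=True)
    tags = Column(ARRAY(Text), nullable=False)


# Same filter as in the answer, written with select() instead of session.query().
stmt = select(Post).where(Post.tags.contains(["python"]))

# Compile for PostgreSQL without connecting to a database.
sql = str(stmt.compile(dialect=postgresql.dialect()))
print(sql)
```

The WHERE clause of the printed statement uses `@>`, meaning "the `tags` array contains every element of the given array".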
The array type has to be Text and not String, otherwise an error occurs.
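The answer does not say which error occurred, so the following is only a way to see how the two choices differ at the column-type level. It compiles both array types for the PostgreSQL dialect; the variable names are illustrative. That the resulting type difference is the cause of the error is an assumption, not something the answer confirms.

```python
from sqlalchemy import String, Text
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import ARRAY

pg = postgresql.dialect()

# Render the column type each declaration would produce in CREATE TABLE.
string_array_ddl = ARRAY(String).compile(dialect=pg)
text_array_ddl = ARRAY(Text).compile(dialect=pg)

print(string_array_ddl)  # a VARCHAR-based array type
print(text_array_ddl)    # a TEXT-based array type
```

PostgreSQL treats `varchar[]` and `text[]` as distinct types, so an array value built as `text[]` may not match a `varchar[]` column without an explicit cast.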