
How to create a GIN index on a text array column in SQLAlchemy (with PostgreSQL and Python)

I want to run a large number of queries that filter by tag on a PostgreSQL table:

from sqlalchemy.dialects.postgresql import ARRAY

class Post(db.Model):
    __tablename__ = 'post'

    id = db.Column(db.Integer, primary_key=True)
    tags = db.Column(ARRAY(db.String))

This link recommends storing tags as a text array with a GIN index.

How do I add a GIN index to the table above? Also, does it make a difference whether I use the String or the Text datatype?

Asked by kampta on May 23 '16 at 11:05




1 Answer

I solved it as follows:

from sqlalchemy.dialects.postgresql import ARRAY, array

class Post(db.Model):
    __tablename__ = 'post'

    id = db.Column(db.Integer, primary_key=True)
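    # Text elements (not String); new rows default to an empty text[] value
    tags = db.Column(ARRAY(db.Text), nullable=False, default=db.cast(array([], type_=db.Text), ARRAY(db.Text)))
    # GIN index on the array column, which containment queries such as contains() can use
    __table_args__ = (db.Index('ix_post_tags', tags, postgresql_using="gin"), )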
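
To check what the index definition turns into, the DDL can be compiled without connecting to a database. The following is only a sketch: it assumes plain SQLAlchemy Core with a `Table` object standing in for the Flask-SQLAlchemy `db.Model` above, and the variable names `post`, `gin_index` and `index_ddl` are illustrative.

```python
from sqlalchemy import Column, Index, Integer, MetaData, Table, Text
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import ARRAY
from sqlalchemy.schema import CreateIndex

metadata = MetaData()

# Core equivalent of the Post model (assumption: same table and column names)
post = Table(
    "post",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("tags", ARRAY(Text), nullable=False),
)

# Same index as in __table_args__: the postgresql_using option selects the index method
gin_index = Index("ix_post_tags", post.c.tags, postgresql_using="gin")

# Render the CREATE INDEX statement for the PostgreSQL dialect; nothing is executed
index_ddl = str(CreateIndex(gin_index).compile(dialect=postgresql.dialect()))
print(index_ddl)
```

The printed statement should contain `USING gin`, which is what `postgresql_using="gin"` controls.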

Then query with:

db.session.query(Post).filter(Post.tags.contains([tag]))
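
For reference, `contains()` on a PostgreSQL `ARRAY` column compiles to the `@>` (contains) operator, which is one of the array operators a GIN index supports. The sketch below assumes plain SQLAlchemy Core rather than the Flask-SQLAlchemy session above, and the tag values are made up. It only renders the SQL fragments and does not talk to a database.

```python
from sqlalchemy import Column, Integer, MetaData, Table, Text
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import ARRAY

metadata = MetaData()

# Core stand-in for the Post model (assumption: same table and column names)
post = Table(
    "post",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("tags", ARRAY(Text), nullable=False),
)

pg = postgresql.dialect()

# Rows whose tags include every listed tag: rendered with @>
contains_sql = str(post.c.tags.contains(["python"]).compile(dialect=pg))

# Rows whose tags share at least one listed tag: rendered with &&
overlap_sql = str(post.c.tags.overlap(["python", "sql"]).compile(dialect=pg))

print(contains_sql)
print(overlap_sql)
```

Whether PostgreSQL actually uses the index for a given query depends on the planner and the data, so running `EXPLAIN` on the real table is the way to confirm it.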

Keep the array element type as Text rather than String; with String I ran into an error.
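
One visible difference between the two choices is the column type that SQLAlchemy emits. This sketch does not reproduce the error mentioned above; it only shows, assuming plain SQLAlchemy and the PostgreSQL dialect, that `ARRAY(String)` and `ARRAY(Text)` map to different PostgreSQL array types, which is the likely area to look at if the types of the column and of the compared value do not match.

```python
from sqlalchemy import String, Text
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import ARRAY

pg = postgresql.dialect()

# Render the DDL type name for each variant; no database connection is needed
string_array_type = ARRAY(String).compile(dialect=pg)  # a VARCHAR-based array type
text_array_type = ARRAY(Text).compile(dialect=pg)      # a TEXT-based array type

print(string_array_type)
print(text_array_type)
```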

Answered by kampta on Sep 18 '22 at 15:09