Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQLAlchemy with PostgreSQL and Full Text Search

I'm using flask, sqlalchemy and flask-sqlalchemy. I want to create a full test search index in postgres with gin and to_tsvector. At the moment, I'm trying the following. I think its the closest I've got to what I'm trying to express, but doesn't work.

from sqlalchemy.ext.declarative import declared_attr
from sqlalchemy.schema import Index
from sqlalchemy.sql.expression import func

from app import db


class Post(db.Model):

    id = db.Column(db.Integer, primary_key=True)
    added = db.Column(db.DateTime, nullable=False)
    pub_date = db.Column(db.DateTime, nullable=True)
    content = db.Column(db.Text)

    @declared_attr
    def __table_args__(cls):
        return (Index('idx_content', func.to_tsvector("english", "content"), postgresql_using="gin"), )

This throws the following error...

Traceback (most recent call last):
  File "./manage.py", line 5, in <module>
    from app import app, db
  File "/vagrant/app/__init__.py", line 36, in <module>
    from pep.models import *
  File "/vagrant/pep/models.py", line 8, in <module>
    class Post(db.Model):
  File "/home/vagrant/.virtualenvs/pep/local/lib/python2.7/site-packages/flask_sqlalchemy.py", line 477, in __init__
    DeclarativeMeta.__init__(self, name, bases, d)
  File "/home/vagrant/.virtualenvs/pep/local/lib/python2.7/site-packages/sqlalchemy/ext/declarative/api.py", line 48, in __init__
    _as_declarative(cls, classname, cls.__dict__)
  File "/home/vagrant/.virtualenvs/pep/local/lib/python2.7/site-packages/sqlalchemy/ext/declarative/base.py", line 222, in _as_declarative
    **table_kw)
  File "/home/vagrant/.virtualenvs/pep/local/lib/python2.7/site-packages/sqlalchemy/schema.py", line 326, in __new__
    table._init(name, metadata, *args, **kw)
  File "/home/vagrant/.virtualenvs/pep/local/lib/python2.7/site-packages/sqlalchemy/schema.py", line 393, in _init
    self._init_items(*args)
  File "/home/vagrant/.virtualenvs/pep/local/lib/python2.7/site-packages/sqlalchemy/schema.py", line 63, in _init_items
    item._set_parent_with_dispatch(self)
  File "/home/vagrant/.virtualenvs/pep/local/lib/python2.7/site-packages/sqlalchemy/events.py", line 235, in _set_parent_with_dispatch
    self._set_parent(parent)
  File "/home/vagrant/.virtualenvs/pep/local/lib/python2.7/site-packages/sqlalchemy/schema.py", line 2321, in _set_parent
    ColumnCollectionMixin._set_parent(self, table)
  File "/home/vagrant/.virtualenvs/pep/local/lib/python2.7/site-packages/sqlalchemy/schema.py", line 1978, in _set_parent
    self.columns.add(col)
  File "/home/vagrant/.virtualenvs/pep/local/lib/python2.7/site-packages/sqlalchemy/sql/expression.py", line 2391, in add
    self[column.key] = column
  File "/home/vagrant/.virtualenvs/pep/local/lib/python2.7/site-packages/sqlalchemy/sql/expression.py", line 2211, in __getattr__
    key)
AttributeError: Neither 'Function' object nor 'Comparator' object has an attribute 'key'

I've also tried

return (Index('idx_content', "content", postgresql_using="gin"), )

However, it doesn't work as postgres (9.1 at least, as that's what I run) expects to_tsvector to be called. This line creates the SQL;

CREATE INDEX content_index ON post USING gin (content)

rather than what I want;

CREATE INDEX content_index ON post USING gin(to_tsvector('english', content))

I opened a ticket as I think this may be a bug/limitation. http://www.sqlalchemy.org/trac/ticket/2605

like image 237
d0ugal Avatar asked Nov 13 '12 12:11

d0ugal


2 Answers

For now I've added the following lines to do it manually, but I'd much rather the 'correct' SQLAlchemy approach if there is one.

create_index = DDL("CREATE INDEX idx_content ON pep USING gin(to_tsvector('english', content));")
event.listen(Pep.__table__, 'after_create', create_index.execute_if(dialect='postgresql'))

There was some interesting discussion on the SQLAlchemy bug tracker. It looks like this is a limitation of the current indexing definition. Basically, my requirement is to allow indexes to be expressions rather than just column names but that isn't currently supported. This ticket is tracking this feature request: http://www.sqlalchemy.org/trac/ticket/695 . However, this is waiting for a developer to take forward and do the work (and has been for a while).

like image 66
d0ugal Avatar answered Sep 22 '22 09:09

d0ugal


Ran across this old question as I was working on creating some single and multicolumn tsvector GIN indexes. For anyone that is looking for a simple way to create these indexes using string representations of the column names, here is one approach using the SQLAlchemy text() construct.

from sqlalchemy import Column, Index, Integer, String, text
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.sql import func


Base = declarative_base()

def to_tsvector_ix(*columns):
    s = " || ' ' || ".join(columns)
    return func.to_tsvector('english', text(s))

class Example(Base):
    __tablename__ = 'examples'

    id = Column(Integer, primary_key=True)
    atext = Column(String)
    btext = Column(String)

    __table_args__ = (
        Index(
            'ix_examples_tsv',
            to_tsvector_ix('atext', 'btext'),
            postgresql_using='gin'
            ),
        )
like image 40
benvc Avatar answered Sep 24 '22 09:09

benvc