
Using multiple POSTGRES databases and schemas with the same Flask-SQLAlchemy model

I'm going to be very specific here, because similar questions have been asked, but none of the solutions work for this problem.

I'm working on a project that has four Postgres databases, but for the sake of simplicity let's say there are two: A and B.

A and B represent two geographical locations; the tables have the same structure in both databases.

Sample model:

from flask_sqlalchemy import SQLAlchemy
from sqlalchemy import *
from sqlalchemy.ext.declarative import declarative_base

db = SQLAlchemy()
Base = declarative_base()

class FRARecord(Base):
    __tablename__ = 'tb_fra_credentials'

    recnr = Column(db.Integer, primary_key = True)
    fra_code = Column(db.Integer)
    fra_first_name = Column(db.String)

This model is replicated in both databases, but under a different schema name in each, so to make it work in A I need to add:

__table_args__ = {'schema' : 'A_schema'}

I'd like to use a single content provider that is given the database to access, but has identical methods:

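import logging

class ContentProvider():
    def __init__(self, database):
        # Name of the database (A, B, ...) this provider should read from
        self.database = database

    def get_fra_list(self):
        logging.debug("Fetching fra list")
        fra_list = db.session.query(FRARecord.fra_code)
        return fra_list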

There are two problems: how do I decide which database to point to, and how do I avoid replicating the model code for each schema (this part is Postgres-specific)?

Here's what I've tried so far:

1) I've made separate files for each of the models and inherited them, so:

class FRARecordA(FRARecord):
    __table_args__ = {'schema' : 'A_schema'}

This doesn't seem to work, because I get the error:

"Can't place __table_args__ on an inherited class with no table."

In other words, I can't set that argument on a subclass after the table has already been declared in its parent.

2) So I tried to do the same with multiple inheritance,

class FRARecord():
    recnr = Column(db.Integer, primary_key = True)
    fra_code = Column(db.Integer)
    fra_first_name = Column(db.String)

class FRARecordA(Base, FRARecord):
    __tablename__ = 'tb_fra_credentials'
    __table_args__ = {'schema' : 'A_schema'}

but got the predictable error:

"CompileError: Cannot compile Column object until its 'name' is assigned."

Obviously I can't move the Column objects to the FRARecordA model without having to repeat them for B as well (and there are actually 4 databases and a lot more models).
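
For reference, a shared-columns mixin can be mapped once per schema without repeating the Column definitions. The following is a minimal sketch, not a confirmed fix for the error above: it assumes SQLAlchemy 1.4 or newer, the names FRARecordMixin and make_fra_record are hypothetical, and it assumes the CompileError came from querying the unmapped mixin (FRARecord.fra_code) rather than the mapped subclass.

```python
from sqlalchemy import Column, Integer, String, select
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class FRARecordMixin:
    """Columns shared by every location; this class is not mapped on its own."""
    __tablename__ = 'tb_fra_credentials'

    recnr = Column(Integer, primary_key=True)
    fra_code = Column(Integer)
    fra_first_name = Column(String)

def make_fra_record(suffix, schema):
    # Hypothetical helper: builds e.g. FRARecordA(FRARecordMixin, Base)
    # with its own schema. Declarative copies the mixin columns per class.
    return type(
        'FRARecord' + suffix,
        (FRARecordMixin, Base),
        {'__table_args__': {'schema': schema}},
    )

FRARecordA = make_fra_record('A', 'A_schema')
FRARecordB = make_fra_record('B', 'B_schema')

# Queries have to reference the mapped class, never the mixin itself.
stmt_a = select(FRARecordA.fra_code)
```

Each generated class gets its own Table object, so this still produces one class per database; it only removes the duplicated column definitions.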

3) Finally, I'm considering doing some sort of sharding (which seems to be the correct approach), but I can't find an example of how I'd go about this. My feeling is that I'd just use a single object like this:

class FRARecord(Base):
    __tablename__ = 'tb_fra_credentials'

    @declared_attr
    def __table_args__(cls):
        #something where I go through the values in bind keys like
        for key, value in self.db.app.config['SQLALCHEMY_BINDS'].iteritems():
            # Return based on current session maybe? And then have different sessions in the content provider?

    recnr = Column(db.Integer, primary_key = True)
    fra_code = Column(db.Integer)
    fra_first_name = Column(db.String)

Just to be clear, my intention for accessing the different databases was as follows:

app.config['SQLALCHEMY_DATABASE_URI']='postgresql://%(user)s:\
%(pw)s@%(host)s:%(port)s/%(db)s' % POSTGRES_A

app.config['SQLALCHEMY_BINDS']={'B':'postgresql://%(user)s:%(pw)s@%(host)s:%(port)s/%(db)s' % POSTGRES_B,
                                  'C':'postgresql://%(user)s:%(pw)s@%(host)s:%(port)s/%(db)s' % POSTGRES_C,
                                  'D':'postgresql://%(user)s:%(pw)s@%(host)s:%(port)s/%(db)s' % POSTGRES_D
                                 }

Here the POSTGRES dictionaries contain all the keys needed to connect to each database.

I assumed with the inherited objects, I'd just connect to the correct one like this (so the sqlalchemy query would automatically know):

class FRARecordB(FRARecord):
    __bind_key__ = 'B'
    __table_args__ = {'schema' : 'B_schema'}

Akhil Cherian Verghese asked Oct 17 '22 15:10


1 Answer

Finally found a solution to this.

Essentially, I didn't create new classes for each database; I used a different database connection for each one.

This method on its own is pretty common. The tricky part (which I couldn't find examples of) was handling the schema differences. I ended up doing this:

import logging

from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker

Session = sessionmaker()

# Same URL template for every center. A, B, echo and the POSTGRES_* dicts are defined elsewhere.
# pool_threadlocal=True is dropped here because newer SQLAlchemy releases no longer accept it.
DB_URL = 'postgresql://%(user)s:%(pw)s@%(host)s:%(port)s/%(db)s'

class ContentProvider():

    db = None
    connection = None
    session = None

    def __init__(self, center):
        if center == A:
            self.db = create_engine(DB_URL % POSTGRES_A, echo=echo)
            self.connection = self.db.connect()
            # It's not very clean, but this was the extra step. You could also set specific connection params if you have multiple schemas
            self.connection.execute(text('set search_path=A_schema'))
        elif center == B:
            self.db = create_engine(DB_URL % POSTGRES_B, echo=echo)
            self.connection = self.db.connect()
            self.connection.execute(text('set search_path=B_schema'))
        # Bind the session to this connection so its queries run with the search_path set above
        self.session = Session(bind=self.connection)

    def get_fra_list(self):
        logging.debug("Fetching fra list")
        fra_list = self.session.query(FRARecord.fra_code)
        return fra_list
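
An alternative to setting search_path by hand is SQLAlchemy's schema_translate_map execution option, which maps the model's schema (None here, since the model declares none) to a per-engine schema name. The following is a minimal sketch under some assumptions, not the setup from the question: it targets SQLAlchemy 1.4 or newer, uses an in-memory SQLite database with two attached databases as a stand-in for the Postgres schemas, and the names a_schema, b_schema, SCHEMAS, engines and the module-level get_fra_list are made up for illustration.

```python
from sqlalchemy import Column, Integer, String, create_engine, text
from sqlalchemy.orm import Session, declarative_base
from sqlalchemy.pool import StaticPool

Base = declarative_base()

class FRARecord(Base):
    __tablename__ = 'tb_fra_credentials'  # no schema on the model itself

    recnr = Column(Integer, primary_key=True)
    fra_code = Column(Integer)
    fra_first_name = Column(String)

# Stand-in for Postgres (assumption): a single in-memory SQLite connection
# with two attached databases playing the role of the two schemas.
base_engine = create_engine('sqlite://', poolclass=StaticPool)
with base_engine.connect() as conn:
    conn.execute(text("ATTACH DATABASE ':memory:' AS a_schema"))
    conn.execute(text("ATTACH DATABASE ':memory:' AS b_schema"))

# One engine per center; each rewrites "no schema" to that center's schema.
SCHEMAS = {'A': 'a_schema', 'B': 'b_schema'}
engines = {
    center: base_engine.execution_options(schema_translate_map={None: schema})
    for center, schema in SCHEMAS.items()
}

# Create the same table in each schema and add one distinct row per center.
for engine in engines.values():
    Base.metadata.create_all(engine)

with Session(engines['A']) as session:
    session.add(FRARecord(fra_code=101, fra_first_name='Ann'))
    session.commit()
with Session(engines['B']) as session:
    session.add(FRARecord(fra_code=202, fra_first_name='Bob'))
    session.commit()

def get_fra_list(center):
    # Same model and same query; only the engine behind the session differs.
    with Session(engines[center]) as session:
        return [code for (code,) in session.query(FRARecord.fra_code)]
```

With real Postgres databases, each center would presumably get its own create_engine call with its own URL, followed by the same execution_options call, so the model stays schema-free and ContentProvider only has to pick the right engine.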

Akhil Cherian Verghese answered Oct 20 '22 17:10