Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

alembic generation of materialized view

TL;DR: How do I get alembic to understand and generate SQL for materialized views created in sqlalchemy?

I'm using flask-sqlalchemy and also using alembic with postgres. To get a materialized view working with sqlalchemy, I followed a nice post on the topic. I used it heavily, with just a few minor divergences (the article uses flask-sqlalchemy as well, however the complete code example uses sqlalchemy's declarative base directly instead).

class ActivityView(db.Model):
    __table__ = create_materialized_view(
        'activity_view',
        db.select([
            Activity.id.label('id'),
            Activity.name.label('name'),
            Activity.start_date.label('start_date'),
        ]).where(
            db.and_(
                Activity.start_date != None,
                Activity.start_date <=
                    datetime_to_str(datetime.now(tz=pytz.UTC) + timedelta(hours=48))
            )
        )
    )

    @classmethod
    def refresh(cls, concurrently=True):
        refresh_materialized_view(cls.__table__.fullname, concurrently)

db.Index('activity_view_index',
         ActivityView.__table__.c.id, ActivityView.__table__.c.start_date,
         unique=True)

The create_materialized_view and refresh_materialized_view methods are taken straight from the blog post.

Note that the example above has been greatly simplified and probably seems silly because of my simplifications, but the real idea I want to get at is how do I get alembic to translate this view to a series of alembic operations during a migration?

When I run tests, the code runs fine, the view gets generated fine and everything works. When alembic runs it doesn't do anything with the view. So what I end up doing is copying the SQL that the tests emit for the materialized view into the alembic migrations/versions file and just end up executing that directly as:

op.execute(activities_view_sql)

Similarly, I do the same direct SQL execution when generating the unique index on the materialized view.

Unfortunately my approach is error prone and creates seemingly unnecessary code duplication.

Is there a way to get alembic to understand my ActivityView so that any time it changes, alembic will know how to update the view?

Thanks much!

like image 426
lostdorje Avatar asked Apr 26 '16 04:04

lostdorje


2 Answers

TLDR: Just write the view migration manually. It doesn't seem there is reasonable support for view autogeneration.

I think the easiest way around this is to not rely on Alembic to autogenerate the view for you. Instead, you can instruct it to ignore views like this, in your Alembic env.py:

def include_object(obj, name, type_, reflected, compare_to):
    if obj.info.get("is_view", False):
        return False

    return True

...

def run_migrations_offline():
...
    context.configure(url=url, target_metadata=target_metadata, literal_binds=True, include_object=include_object)

...

def run_migrations_online():
    ....
    with connectable.connect() as connection:
        context.configure(connection=connection, target_metadata=target_metadata, include_object=include_object)

The is_view flag is set by my custom View base class:

class View(Model):
    @classmethod
    def _init_table(cls, sub_cls):
        table: sa.Table = Model._init_table(sub_cls)

        if table is None:
            return table

        table.info["is_view"] = True
        return table

When automatic generation ignores the view, you can then add the appropriate commands to your migration manually:

activities = table(
    "activities",
    sa.Column("id", sa.Integer()),
    ...
)

view_query = (
    select(
        [
            activities.c.id,
        ]
    )
    .select_from(activities)
)

def upgrade():
    view_query_string = str(view_query.compile(compile_kwargs={"literal_binds": True}))

    op.execute(f"CREATE VIEW activity_view AS {view_query_string}")

def downgrade():
    op.execute("DROP VIEW activity_view")

Two important points:

  1. code duplication is not always a bad thing - you can think of migrations as more of a version control tool, than regular code. Your version history should not depend on the current state of the codebase
  2. manually-written migrations are arguably more error prone than generated ones, but you can partially alleviate this by running your migrations in tests for production applications. Also, just inspecting the resulting DB schema should help.
like image 70
Teyras Avatar answered Nov 03 '22 17:11

Teyras


Although the question doesn't specifically call out using PostgreSQL, the post that it is based on targeted materialized views in PostgeSQL, so this answer also targets an add on package called alembic_utils, which is based on alembic ReplaceableObjects that adds support for autogenerating a larger number of PostgreSQL entity types, including functions, views, materialized views, triggers, and policies.

To setup, you create your materialized view in the following way;

from alembic_utils.pg_materialized_view import PGMaterializedView

actview = PGMaterializedView (
    schema="public",
    signature="activity_view",
    definition="select ...",
    with_data=True
)

You could base the definition off static SQL or compiled version of sqlalchemy code.

Then in your alembic env.py:

from foo import actview
from alembic_utils.replaceable_entity import register_entities

register_entities([actview])

Alembic will now autogenerate migrations when the materialized view is updated in code.

like image 1
Peter Henry Avatar answered Nov 03 '22 17:11

Peter Henry