
Schema versioning within the Snowflake data warehouse

I am interested in ways in which users of a Snowflake database can be insulated from change through schema versioning. The approach I have been investigating uses the connection syntax to define a schema: for each release, a new schema would be created holding views over the core tables. Any views that are unchanged would simply be copied forward, while those that were amended would be made backwards compatible. As users connect, they would ideally be given the correct connection syntax for the version they require.
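
To make the idea concrete, a release schema might be built roughly like this with the Python connector (all database, schema and view names below are purely illustrative, not our real objects):

import snowflake.connector

# Connection details are placeholders.
conn = snowflake.connector.connect(
    account='my_account',
    user='my_user',
    password='my_password',
    database='ANALYTICS',
)
cur = conn.cursor()

# One schema per release, containing only views over the core tables.
cur.execute("CREATE SCHEMA IF NOT EXISTS ANALYTICS.SALES_V2")

# Views that are unchanged are copied forward as simple pass-throughs.
cur.execute("""
    CREATE OR REPLACE VIEW ANALYTICS.SALES_V2.ORDERS AS
    SELECT * FROM ANALYTICS.SALES_CORE.ORDERS
""")

# Views over amended tables keep the old shape for backwards compatibility,
# e.g. exposing a renamed column under its previous name.
cur.execute("""
    CREATE OR REPLACE VIEW ANALYTICS.SALES_V2.CUSTOMERS AS
    SELECT id, name, country_code AS country
    FROM ANALYTICS.SALES_CORE.CUSTOMERS
""")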

The problem I have is that there are multiple teams, each owning the schemas associated with a core business area, and I don't think it is possible to define multiple schemas in the connection syntax.
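
For example, with the SQLAlchemy connector the URL only takes a single database and schema, so anything owned by another team would have to be fully qualified in every query (names are illustrative; requires the snowflake-sqlalchemy package):

from sqlalchemy import create_engine, text

# Only one default schema fits in the connection URL.
engine = create_engine(
    "snowflake://my_user:my_password@my_account/ANALYTICS/SALES_V2"
    "?warehouse=REPORTING_WH&role=ANALYST"
)

with engine.connect() as conn:
    conn.execute(text("SELECT * FROM ORDERS"))               # resolves in SALES_V2
    conn.execute(text("SELECT * FROM FINANCE_V3.INVOICES"))  # other teams' schemas must be qualified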

Has anyone achieved this in an environment with multiple users, schemas and development teams?

Regards,

Luke

asked Dec 10 '19 by LukeM


1 Answer

We use Alembic for database version control with Snowflake. Alembic is a "migration" tool that lets you apply incremental changes (migrations) to your data warehouse. It's essentially an add-on to the SQLAlchemy library in Python.
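
Pointing Alembic at Snowflake is mostly just the SQLAlchemy connection; the core of env.py ends up looking something like this (a sketch with placeholder credentials and database, not our actual setup):

from alembic import context
from sqlalchemy import create_engine

# Placeholder URL; in practice this usually comes from alembic.ini or environment variables.
engine = create_engine(
    "snowflake://deploy_user:secret@my_account/ANALYTICS"
    "?warehouse=DEPLOY_WH&role=SYSADMIN"
)

def run_migrations_online():
    with engine.connect() as connection:
        context.configure(connection=connection, target_metadata=None)
        with context.begin_transaction():
            context.run_migrations()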

When developing locally, we create a clone of our database and test our migration changes against the clone. Once we know it works, we push it to GitLab and get it approved, and a CI/CD pipeline with accountadmin credentials then makes the change in production.
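
The local loop is roughly: zero-copy clone the database, point Alembic at the clone, and run the migration. A sketch (names, credentials, and warehouse are placeholders):

import snowflake.connector
from alembic import command
from alembic.config import Config

# Zero-copy clone of the production database to test against.
conn = snowflake.connector.connect(
    account='my_account', user='my_user', password='my_password', role='SYSADMIN'
)
conn.cursor().execute("CREATE OR REPLACE DATABASE ANALYTICS_DEV CLONE ANALYTICS")

# Point Alembic at the clone and apply the pending migrations.
cfg = Config("alembic.ini")
cfg.set_main_option(
    "sqlalchemy.url",
    "snowflake://my_user:my_password@my_account/ANALYTICS_DEV?warehouse=DEV_WH&role=SYSADMIN",
)
command.upgrade(cfg, "head")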

Since it's written in Python, you can connect this to your Git tool (like GitHub or GitLab) and submit changes in a Merge Request and get approval before running this in your Production database.

Here's the documentation: https://alembic.sqlalchemy.org/en/latest/

This is also officially supported according to Snowflake documentation: https://docs.snowflake.net/manuals/user-guide/sqlalchemy.html#alembic-support

An example Alembic migration might look like:


"""
Revision ID: 78a3acc7fbb2
Revises: 3f2ee8d809a6
Create Date: 2019-11-06 11:40:38.438468

"""

# revision identifiers, used by Alembic.
revision = '78a3acc7fbb2'
down_revision = '3f2ee8d809a6'
branch_labels = None
depends_on = None

from alembic import op
import sqlalchemy as sa

def upgrade():
    op.create_table(
        'test_table',
        sa.Column('op', sa.String(length=255), nullable=True),
        sa.Column('id', sa.String(length=255), nullable=False),
        sa.Column('amount', sa.BigInteger(), nullable=True),
        sa.Column('reason', sa.String(length=255), nullable=True),
        sa.Column('deleted', sa.Boolean(), nullable=True),
        sa.Column('user_id', sa.Integer(), nullable=True),
        sa.Column('company_id', sa.Integer(), nullable=True),
        sa.Column('inserted_at', sa.DateTime(), nullable=True),
        sa.Column('updated_at', sa.DateTime(), nullable=True),
        sa.Column('dw_import_filename', sa.String(length=255), nullable=True),
        sa.Column('dw_import_file_row_number', sa.Integer(), nullable=True),
        sa.Column('dw_import_timestamp', sa.TIMESTAMP(), nullable=True),
        sa.PrimaryKeyConstraint('id'),
        schema='test_schema',
    )

def downgrade():
    # ### commands auto generated by Alembic - please adjust! ###
    op.drop_table('test_table', schema='test_schema')

As you can see, you supply an upgrade and also have the ability to downgrade, which reverses the upgrade. If you have any other questions about Alembic, or if this interests you, I'd be happy to explain more.
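
For what it's worth, you would normally run alembic upgrade head and alembic downgrade -1 from the command line; driving the same rollback from Python looks roughly like this (the config file path is whatever your project uses):

from alembic import command
from alembic.config import Config

cfg = Config("alembic.ini")
command.downgrade(cfg, "-1")   # undo the most recently applied revision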

answered Dec 25 '22 by Brock