
Create a table using SQLAlchemy, but defer the creation of indexes until the data is loaded

I have a Python file that uses SQLAlchemy to define all the tables in a given database, including the applicable indexes and foreign key constraints. The file looks something like this:

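# SQLAlchemy 1.4+; older releases import declarative_base from sqlalchemy.ext.declarative
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()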

class FirstLevel(Base):
    __tablename__ = 'first_level'
    first_level_id = Column(Integer, index=True, nullable=False, primary_key=True, autoincrement=True)
    first_level_col1 = Column(String(100), index=True)
    first_level_col2 = Column(String(100))
    first_level_col3 = Column(String(100))

class SecondLevel(Base):
    __tablename__ = 'second_level'
    second_level_id = Column(Integer, index=True, nullable=False, primary_key=True, autoincrement=True)
    first_level_id = Column(None, ForeignKey(FirstLevel.first_level_id, onupdate='cascade', ondelete='cascade', deferrable=True), index=True, nullable=False)
    second_level_col1 = Column(String(100), index=True)
    second_level_col2 = Column(String(100))
    second_level_col3 = Column(String(100))

class ThirdLevel(Base):
    __tablename__ = 'third_level'
    third_level_id = Column(Integer, index=True, nullable=False, primary_key=True, autoincrement=True)
    first_level_id = Column(None, ForeignKey(FirstLevel.first_level_id, onupdate='cascade', ondelete='cascade', deferrable=True), index=True, nullable=False)
    second_level_id = Column(None, ForeignKey(SecondLevel.second_level_id, onupdate='cascade', ondelete='cascade', deferrable=True), index=True, nullable=False)
    third_level_col1 = Column(String(100), index=True)
    third_level_col2 = Column(String(100))
    third_level_col3 = Column(String(100))

...

I can use this file to create a new schema in a PostgreSQL database by running the following:

from sqlalchemy import create_engine

engine = create_engine('postgresql://username:password@path_to_database')
Base.metadata.create_all(engine)

The problem is that I have to load a huge amount of data into this newly created database, and the load takes a very long time unless I remove the indexes and foreign key constraints first. But dropping them by hand and recreating them after all the data is inserted is a big hassle, and it removes most of the convenience of using SQLAlchemy to define the database schema.

Is there a way to use SQLAlchemy to first create the tables in the database, load the data, and then use SQLAlchemy again to create all the indexes and foreign key constraints?

ostrokach asked Oct 20 '22 03:10



1 Answer

You could do this with Alembic migration scripts:

  1. Create the initial tables without indexes (or drop the existing indexes).
  2. Load the data.
  3. Add the indexes.
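
The same three steps can also be sketched with plain SQLAlchemy instead of Alembic. This is only a minimal illustration under some assumptions: it relies on `Table.indexes` being an ordinary mutable set that `create_all()` consults, the helper names `create_tables_without_indexes` and `create_deferred_indexes` are made up for this example, the model is a trimmed copy of `FirstLevel`, and an in-memory SQLite engine stands in for the PostgreSQL database. It defers indexes only; foreign key constraints are not handled here.

```python
from sqlalchemy import Column, Integer, String, create_engine, inspect
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class FirstLevel(Base):
    # Trimmed version of the model from the question.
    __tablename__ = 'first_level'
    first_level_id = Column(Integer, primary_key=True)
    first_level_col1 = Column(String(100), index=True)
    first_level_col2 = Column(String(100))


def create_tables_without_indexes(metadata, engine):
    """Hypothetical helper: create every table but hold back its indexes."""
    deferred = []
    for table in metadata.tables.values():
        deferred.extend(table.indexes)
        # create_all() only emits CREATE INDEX for indexes still in this set.
        table.indexes.clear()
    metadata.create_all(engine)
    return deferred


def create_deferred_indexes(deferred, engine):
    """Hypothetical helper: re-attach and create the indexes held back earlier."""
    for index in deferred:
        index.table.indexes.add(index)  # restore the in-memory metadata
        index.create(engine)            # emits CREATE INDEX


# Assumption: in-memory SQLite for illustration only.
engine = create_engine('sqlite://')

# Step 1: tables only, no secondary indexes yet.
deferred = create_tables_without_indexes(Base.metadata, engine)

# Step 2: bulk load (two rows stand in for the real data set).
with engine.begin() as conn:
    conn.execute(
        FirstLevel.__table__.insert(),
        [{'first_level_col1': 'a'}, {'first_level_col1': 'b'}],
    )

# Step 3: build the indexes once the data is in place.
create_deferred_indexes(deferred, engine)
index_names = {ix['name'] for ix in inspect(engine).get_indexes('first_level')}
```

Against a real database you would point `create_engine` at your own URL and replace the two-row insert with the actual load. Alembic remains the better fit if the schema also needs versioned migrations.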
plaes answered Oct 27 '22 07:10
