 

SQLAlchemy - copy schema and data of subquery to another database

I am trying to copy the data returned by a subquery from a PostgreSQL database (from_engine) into an SQLite database. I can do this for a whole table with the following:

smeta = MetaData(bind=from_engine)
table = Table(table_name, smeta, autoload=True)
table.metadata.create_all(to_engine)

However, I am not sure how to achieve the same for a subquery statement.

-Sandeep

Edit: Following up on the answer. Once I have created the table, I want to build a subquery statement as follows:

table = Table("newtable", dest_metadata, *columns)
stmt = dest_session.query(table).subquery();

However, the last statement fails with:

cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (ProgrammingError) relation "newtable" does not exist
LINE 3: FROM newtable) AS anon_1

asked Feb 14 '14 by Sandeep

2 Answers

One way that works at least in some cases:

  1. Use column_descriptions of a query object to get some information about the columns in the result set.

  2. With that information you can build the schema to create the new table in the other database.

  3. Run the query in the source database and insert the results into the new table.

First, some setup for the example:

from sqlalchemy import create_engine, MetaData
from sqlalchemy import Column, Integer, String, Table
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

# Engine for the database the data is queried from
# (PostgreSQL in the question; an in-memory SQLite
# database here so the example is self-contained)
source_engine = create_engine('sqlite:///:memory:', echo=True)
SourceSession = sessionmaker(source_engine)

# Engine for the database the results are stored in
# (SQLite, as in the question)
dest_engine = create_engine('sqlite:///:memory:', echo=True)
DestSession = sessionmaker(dest_engine)

# Create a toy table and fill it with some data
Base = declarative_base()
class Pet(Base):
    __tablename__ = 'pets'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    race = Column(String)

Base.metadata.create_all(source_engine)
sourceSession = SourceSession()
sourceSession.add(Pet(name="Fido", race="cat"))
sourceSession.add(Pet(name="Ceasar", race="cat"))
sourceSession.add(Pet(name="Rex", race="dog"))
sourceSession.commit()

Now to the interesting bit:

# This is the query we want to persist in a new table:
query= sourceSession.query(Pet.name, Pet.race).filter_by(race='cat')

# Build the schema for the new table
# based on the columns that will be returned 
# by the query:
metadata = MetaData(bind=dest_engine)
columns = [Column(desc['name'], desc['type']) for desc in query.column_descriptions]
column_names = [desc['name'] for desc in query.column_descriptions]
table = Table("newtable", metadata, *columns)

# Create the new table in the destination database
table.create(dest_engine)

# Finally execute the query
destSession = DestSession()
for row in query:
    destSession.execute(table.insert(row))
destSession.commit()
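
As a quick sanity check (and to address the follow-up error in the question, which came from querying newtable before it had been created in the destination database): after table.create(dest_engine) and the inserts above, the copied table can be selected from the destination session. A minimal sketch, assuming the code above has run:

# Sketch: verify the copied rows are visible through destSession.
for name, race in destSession.execute(table.select()):
    print(name, race)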

There are probably more efficient ways to do the row-by-row insert loop, but bulk inserts are a topic of their own.
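That said, an executemany-style insert is a simple variant. Here is a sketch (not benchmarked) that reuses the column_names list built above as an alternative to the loop:

# Sketch of a bulk insert: build a list of dicts keyed by the column names
# and let SQLAlchemy emit a single executemany-style INSERT.
rows = [dict(zip(column_names, row)) for row in query]
if rows:
    destSession.execute(table.insert(), rows)
    destSession.commit()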

answered by Stefan

You can also go through a pandas DataFrame. For example, read the query results with pandas.read_sql(query, source.connection) and write them out with df.to_sql(table_name, con=destination.connection).
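A minimal sketch of that idea, reusing the query and engines from the accepted answer's setup (the table name "newtable" is just an example; read_sql accepts the query's underlying SELECT statement, and to_sql creates the table if it does not exist):

import pandas as pd

# Read the query results from the source database into a DataFrame...
df = pd.read_sql(query.statement, source_engine)

# ...and write them out as a table in the destination database.
df.to_sql("newtable", con=dest_engine, index=False, if_exists="replace")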

answered by Paul Rougieux