I want to drop a table (if it exists) before writing some data from a Pandas DataFrame:
import sqlalchemy

def store_sqlite(in_data, dbpath='my.db', table='mytab'):
    database = sqlalchemy.create_engine('sqlite:///' + dbpath)
    ## DROP TABLE HERE
    in_data.to_sql(name=table, con=database, if_exists='append')
    database.dispose()  # Engine has no close(); dispose() releases the connection pool
The SQLAlchemy documentation all points to a Table.drop()
method - how would I create that Table object, or equivalently is there an alternative way to drop this table?
Note: I can't just use if_exists = 'replace'
as the input data is actually a dict of DataFrames which I loop over - I've suppressed that code for clarity (I hope).
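To answer the question as asked: the Table object can be built by reflecting the existing table from the database and then calling drop() on it. A minimal sketch, assuming SQLAlchemy 1.4+ (the helper name drop_table_if_exists is mine, not from the question):

```python
import sqlalchemy

def drop_table_if_exists(engine, table_name):
    # Reflect the existing table definition from the database, then drop it;
    # checkfirst=True makes the drop a no-op if the table vanished meanwhile.
    if sqlalchemy.inspect(engine).has_table(table_name):
        meta = sqlalchemy.MetaData()
        tab = sqlalchemy.Table(table_name, meta, autoload_with=engine)
        tab.drop(engine, checkfirst=True)
```

After this, to_sql(..., if_exists='append') recreates the table from the DataFrame's own dtypes.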
From the pandas docs:
"You can also run a plain query without creating a dataframe with execute(). This is useful for queries that don’t return values, such as INSERT. This is functionally equivalent to calling execute on the SQLAlchemy engine or db connection object."
http://pandas.pydata.org/pandas-docs/version/0.18.0/io.html#id3
So I do this:
from pandas.io import sql
sql.execute('DROP TABLE IF EXISTS %s'%table, engine)
sql.execute('VACUUM', engine)
where "engine" is the SQLAlchemy database object (the OP's "database" above). VACUUM is optional; it just reduces the size of the SQLite file (I use the table-drop part infrequently in my code).
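Note that pandas.io.sql.execute has been removed in recent pandas releases. The same drop-then-append pattern, applied to the OP's dict of DataFrames, can be sketched with plain SQLAlchemy instead (the function and variable names here are hypothetical):

```python
import pandas as pd
from sqlalchemy import create_engine, text

def store_frames(frames, engine):
    # frames is a dict mapping table name -> DataFrame.
    # Drop each target table, then append the fresh data,
    # all inside a single transaction.
    with engine.begin() as conn:
        for table, df in frames.items():
            conn.execute(text('DROP TABLE IF EXISTS "%s"' % table))
            df.to_sql(name=table, con=conn, if_exists='append', index=False)
```

Running the drops and inserts inside one engine.begin() block means a failure partway through rolls everything back.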
You should be able to create a cursor from your SQLAlchemy engine:
import sqlalchemy
engine = sqlalchemy.create_engine('sqlite:///' + dbpath)
connection = engine.raw_connection()
cursor = connection.cursor()
command = "DROP TABLE IF EXISTS {};".format(table)
cursor.execute(command)
connection.commit()
cursor.close()
# Now you can chunk upload your data as you wish
in_data.to_sql(name=table, con=engine, if_exists='append')
If you're loading a lot of data into your db, you may find it faster to use pandas' to_csv()
and PostgreSQL's COPY via psycopg2's copy_from
function. You can also use StringIO()
to hold the CSV in memory and avoid having to write a file to disk.
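The in-memory CSV part of that idea can be sketched as below; the copy_from call itself needs a live PostgreSQL connection, so it is shown only as a comment (the connection "conn" and table name "mytab" are hypothetical):

```python
import io
import pandas as pd

def frame_to_buffer(df):
    # Serialize the frame to an in-memory CSV buffer, skipping the
    # header and index so COPY sees bare comma-separated rows.
    buf = io.StringIO()
    df.to_csv(buf, index=False, header=False)
    buf.seek(0)  # rewind so the consumer reads from the start
    return buf

# With a psycopg2 connection the buffer feeds copy_from directly:
# cur = conn.cursor()
# cur.copy_from(frame_to_buffer(df), 'mytab', sep=',')
# conn.commit()
```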