Pandas has a lovely to_sql
method for writing dataframes to any RDBMS supported by SQLAlchemy.
Say I have a dataframe generated thusly:
df = pd.DataFrame([-1.04, 0.70, 0.11, -0.43, 1.0], columns=['value'])
If I try to write it to the database without any special behavior, I get a column type of double precision:
df.to_sql('foo_test', an_engine)
If I wanted a different datatype, I could specify it (this works fine):
df.to_sql('foo_test', an_engine, dtype={'value': sqlalchemy.types.NUMERIC})
But if I want to set the precision and scale of the NUMERIC
column, it blows up in my face:
df.to_sql('foo_test', an_engine, dtype={'value': sqlalchemy.types.NUMERIC(10,2)})
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-77-dc008463fbfc> in <module>()
1 df = pd.DataFrame([-1.04, 0.70, 0.11, -0.43, 1.0], columns=['value'])
----> 2 df.to_sql('foo_test', cosd_engine, dtype={'value': sqlalchemy.types.NUMERIC(10,2)})
/Users/igazit/.virtualenvs/myproject/lib/python2.7/site-packages/pandas/core/generic.pyc in to_sql(self, name, con, flavor, schema, if_exists, index, index_label, chunksize, dtype)
964 self, name, con, flavor=flavor, schema=schema, if_exists=if_exists,
965 index=index, index_label=index_label, chunksize=chunksize,
--> 966 dtype=dtype)
967
968 def to_pickle(self, path):
/Users/igazit/.virtualenvs/myproject/lib/python2.7/site-packages/pandas/io/sql.pyc in to_sql(frame, name, con, flavor, schema, if_exists, index, index_label, chunksize, dtype)
536 pandas_sql.to_sql(frame, name, if_exists=if_exists, index=index,
537 index_label=index_label, schema=schema,
--> 538 chunksize=chunksize, dtype=dtype)
539
540
/Users/igazit/.virtualenvs/myproject/lib/python2.7/site-packages/pandas/io/sql.pyc in to_sql(self, frame, name, if_exists, index, index_label, schema, chunksize, dtype)
1162 import sqlalchemy.sql.type_api as type_api
1163 for col, my_type in dtype.items():
-> 1164 if not issubclass(my_type, type_api.TypeEngine):
1165 raise ValueError('The type of %s is not a SQLAlchemy '
1166 'type ' % col)
TypeError: issubclass() arg 1 must be a class
I'm trying to dig into why sqlalchemy.types.NUMERIC
passes the test on line 1164, while sqlalchemy.types.NUMERIC(10,2)
doesn't. They do have different types (sqlalchemy.sql.visitors.VisitableType
vs sqlalchemy.sql.sqltypes.NUMERIC): the former is a class,
the latter is an instance, and issubclass() requires a class as its first argument.
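A quick check in the REPL shows the distinction (assuming SQLAlchemy is installed; to_instance is the helper pandas later switched to):

```python
import sqlalchemy.types as types

# NUMERIC itself is a class, so issubclass() is happy
assert issubclass(types.NUMERIC, types.TypeEngine)

# NUMERIC(10, 2) is an *instance*, so issubclass() raises TypeError
try:
    issubclass(types.NUMERIC(10, 2), types.TypeEngine)
except TypeError as e:
    print(e)  # e.g. "issubclass() arg 1 must be a class"

# isinstance() + to_instance() accepts both the class and the instance form
assert isinstance(types.to_instance(types.NUMERIC), types.TypeEngine)
assert isinstance(types.to_instance(types.NUMERIC(10, 2)), types.TypeEngine)
```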
Any clues would be much appreciated!
Update: this bug is fixed in pandas >= 0.16.0.
This is a known pandas bug in 0.15.2 that produces the same error:
https://github.com/pydata/pandas/issues/9083
A collaborator there suggests monkey-patching to_sql as a workaround:
import warnings

import pandas as pd
from pandas.io.sql import SQLTable


def to_sql(self, frame, name, if_exists='fail', index=True,
           index_label=None, schema=None, chunksize=None, dtype=None):
    """
    Patched version of https://github.com/pydata/pandas/blob/v0.15.2/pandas/io/sql.py#L1129
    """
    if dtype is not None:
        from sqlalchemy.types import to_instance, TypeEngine
        for col, my_type in dtype.items():
            # Convert to an instance first, then isinstance() works for
            # both the class form (NUMERIC) and the instance form (NUMERIC(10,2))
            if not isinstance(to_instance(my_type), TypeEngine):
                raise ValueError('The type of %s is not a SQLAlchemy '
                                 'type ' % col)

    table = SQLTable(name, self, frame=frame, index=index,
                     if_exists=if_exists, index_label=index_label,
                     schema=schema, dtype=dtype)
    table.create()
    table.insert(chunksize)
    # check for potential case sensitivity issues (GH7815)
    if name not in self.engine.table_names(schema=schema or self.meta.schema):
        warnings.warn("The provided table name '{0}' is not found exactly "
                      "as such in the database after writing the table, "
                      "possibly due to case sensitivity issues. Consider "
                      "using lower case table names.".format(name), UserWarning)


pd.io.sql.SQLDatabase.to_sql = to_sql
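With the patch applied (or on pandas >= 0.16.0, where no patch is needed), the parameterized instance form works. A quick smoke test with an in-memory SQLite engine:

```python
import pandas as pd
import sqlalchemy

engine = sqlalchemy.create_engine('sqlite://')  # throwaway in-memory database
df = pd.DataFrame([-1.04, 0.70, 0.11, -0.43, 1.0], columns=['value'])

# Passing a parameterized type instance no longer blows up
df.to_sql('foo_test', engine, dtype={'value': sqlalchemy.types.NUMERIC(10, 2)})

print(pd.read_sql('foo_test', engine))
```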