I am trying to load an SQLAlchemy in a pandas DataFrame.
When I do:
df = pd.DataFrame(LPRRank.query.all())
I get
>>> df
0 <M. Misty || 1 || 18>
1 <P. Patch || 2 || 18>
...
...
But, what I want is each column in the database to be a column in the dataframe:
0 M. Misty 1 18
1 P. Patch 2 18
...
...
and when I try:
dff = pd.read_sql_query(LPRRank.query.all(), db.session())
I get an Attribute Error:
AttributeError: 'SignallingSession' object has no attribute 'cursor'
and
dff = pd.read_sql_query(LPRRank.query.all(), db.session)
also gives an error:
AttributeError: 'scoped_session' object has no attribute 'cursor'
What I'm using to generate the list of objects is:
app = Flask(__name__)
db = SQLAlchemy(app)
class LPRRank(db.Model):
id = db.Column(db.Integer, primary_key=True)
candid = db.Column(db.String(40), index=True, unique=False)
rank = db.Column(db.Integer, index=True, unique=False)
user_id = db.Column(db.Integer, db.ForeignKey('lprvote.id'))
def __repr__(self):
return '<{} || {} || {}>'.format(self.candid,
self.rank, self.user_id)
This question: How to convert SQL Query result to PANDAS Data Structure? is error free, but gives each row as an object, which is not what I want. I can access the individual columns in the returned object, but its seems like there is a better way to do it.
The documentation at pandas.pydata.org is great if you already understand what is going on and just need to review syntax. The documentation from April 20, 2016 (the 1319 page pdf) identifies a pandas connection as still experimental on p.872.
Now, SQLALCHEMY/PANDAS - SQLAlchemy reading column as CLOB for Pandas to_sql is about specifying the SQL type. Mine is SQLAlchemy which is the default.
And, sqlalchemy pandas to_sql OperationalError, Writing to MySQL database with pandas using SQLAlchemy, to_sql, and SQLAlchemy/pandas to_sql for SQLServer -- CREATE TABLE in master db are about writing to the SQL database which produces an operational error, a database error, and a 'create table' error neither of which are my problems.
This one, SQLAlchemy Pandas read_sql from jsonb wants a jsonb
attribute to columns: not my cup 'o tea.
This previous question SQLAlchemy ORM conversion to pandas DataFrame addresses my issue but the solution: using query.session.bind
is not my solution. I'm opening /closing sessions with db.session.add(), and db.session.commit(), but when I use db.session.bind
as specified in the second answer here, then I get an Attribute Error:
AttributeError: 'list' object has no attribute '_execute_on_connection'
Simply add an __init__
method in your model and call the Class object before dataframe build. Specifically below creates an iterable of tuples binded into columns with pandas.DataFrame()
.
class LPRRank(db.Model):
id = db.Column(db.Integer, primary_key=True)
candid = db.Column(db.String(40), index=True, unique=False)
rank = db.Column(db.Integer, index=True, unique=False)
user_id = db.Column(db.Integer, db.ForeignKey('lprvote.id'))
def __init__(self, candid=None, rank=None, user_id=None):
self.data = (candid, rank, user_id)
def __repr__(self):
return (self.candid, self.rank, self.user_id)
data = db.session.query(LPRRank).all()
df = pd.DataFrame([(d.candid, d.rank, d.user_id) for d in data],
columns=['candid', 'rank', 'user_id'])
Alternatively, use the SQLAlchemy ORM based on your defined Model class, LPRRank, to run read_sql
:
df = pd.read_sql(sql = db.session.query(LPRRank)\
.with_entities(LPRRank.candid,
LPRRank.rank,
LPRRank.user_id).statement,
con = db.session.bind)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With