Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Moving data from sqlalchemy to a pandas DataFrame

I am trying to load an SQLAlchemy in a pandas DataFrame.

When I do:

df = pd.DataFrame(LPRRank.query.all())

I get

>>> df
0        <M. Misty || 1 || 18>
1        <P. Patch || 2 || 18>
...
...

But, what I want is each column in the database to be a column in the dataframe:

0        M. Misty  1  18
1        P. Patch  2  18
...
...

and when I try:

dff = pd.read_sql_query(LPRRank.query.all(), db.session())

I get an Attribute Error:

AttributeError: 'SignallingSession' object has no attribute 'cursor'

and

dff = pd.read_sql_query(LPRRank.query.all(), db.session)

also gives an error:

AttributeError: 'scoped_session' object has no attribute 'cursor'

What I'm using to generate the list of objects is:

app = Flask(__name__)
db = SQLAlchemy(app)

class LPRRank(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    candid = db.Column(db.String(40), index=True, unique=False)
    rank = db.Column(db.Integer, index=True, unique=False) 
    user_id = db.Column(db.Integer, db.ForeignKey('lprvote.id'))

    def __repr__(self):
        return '<{} || {} || {}>'.format(self.candid,
                                                 self.rank, self.user_id) 

This question: How to convert SQL Query result to PANDAS Data Structure? is error free, but gives each row as an object, which is not what I want. I can access the individual columns in the returned object, but its seems like there is a better way to do it.

The documentation at pandas.pydata.org is great if you already understand what is going on and just need to review syntax. The documentation from April 20, 2016 (the 1319 page pdf) identifies a pandas connection as still experimental on p.872.

Now, SQLALCHEMY/PANDAS - SQLAlchemy reading column as CLOB for Pandas to_sql is about specifying the SQL type. Mine is SQLAlchemy which is the default.

And, sqlalchemy pandas to_sql OperationalError, Writing to MySQL database with pandas using SQLAlchemy, to_sql, and SQLAlchemy/pandas to_sql for SQLServer -- CREATE TABLE in master db are about writing to the SQL database which produces an operational error, a database error, and a 'create table' error neither of which are my problems.

This one, SQLAlchemy Pandas read_sql from jsonb wants a jsonb attribute to columns: not my cup 'o tea.

This previous question SQLAlchemy ORM conversion to pandas DataFrame addresses my issue but the solution: using query.session.bind is not my solution. I'm opening /closing sessions with db.session.add(), and db.session.commit(), but when I use db.session.bind as specified in the second answer here, then I get an Attribute Error:

AttributeError: 'list' object has no attribute '_execute_on_connection'
like image 921
David Collins Avatar asked Mar 10 '18 23:03

David Collins


Video Answer


1 Answers

Simply add an __init__ method in your model and call the Class object before dataframe build. Specifically below creates an iterable of tuples binded into columns with pandas.DataFrame().

class LPRRank(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    candid = db.Column(db.String(40), index=True, unique=False)
    rank = db.Column(db.Integer, index=True, unique=False) 
    user_id = db.Column(db.Integer, db.ForeignKey('lprvote.id'))

    def __init__(self, candid=None, rank=None, user_id=None):
        self.data = (candid, rank, user_id)

    def __repr__(self):
        return (self.candid, self.rank, self.user_id) 

data = db.session.query(LPRRank).all()
df = pd.DataFrame([(d.candid, d.rank, d.user_id) for d in data], 
                  columns=['candid', 'rank', 'user_id'])

Alternatively, use the SQLAlchemy ORM based on your defined Model class, LPRRank, to run read_sql:

df = pd.read_sql(sql = db.session.query(LPRRank)\
                         .with_entities(LPRRank.candid,
                                        LPRRank.rank,
                                        LPRRank.user_id).statement, 
                 con = db.session.bind)
like image 172
Parfait Avatar answered Oct 10 '22 16:10

Parfait