Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQLAlchemy & pandas: How to query columns with a different label?

When using pandas read_sql to query my database using SQLAlchemy, I get the following warning:

SAWarning: Column 'id' on table being replaced by Column('id', Integer(), table=, primary_key=True, nullable=False), which has the same key. Consider use_labels for select() statements. (key, getattr(existing, 'table', None), value))

Right, so each of my League, Season, Round, Match, and Team tables have a column named id. Of course.

I ignored this at first, but this came to bite me in the buttocks when I wanted to delete one of the columns with the id label/name, using pd.drop(). Even pd.rename referencing the column by its index (!) renamed all columns with the same name:

pandoc.rename(
    columns={pandoc.columns[1]: 'match_id'}, 
    inplace=True)
# This replaced all columns with the label `id` to `match_id`

SQLAlchemy advises that I use consider use_labels for select() statements, and while I managed to do with with regular queries, I can't really figure out where to stick .label('new_column_name') in the following query:

pandoc = pd.read_sql(
Match.query.options(
        joinedload(Match.home_team),
        joinedload(Match.away_team)).statement, 
    db.session.bind,
    parse_dates=['date_time'])

One option is to change all id columns in my tables to tablename_id, but that seems like an ugly workaround for a problem that should have a fairly simple solution.

Sample output from print(pandoc.head()):

   total_goals  id               round_id  \
0          1.0  somestring here  s12786-0   
1          0.0  somestring here  s12786-0   
2          5.0  somestring here  s12786-0   
3          3.0  somestring here  s12786-0   
4          0.0  somestring here  s12786-0   

   home_team_id  away_team_id   id   id  
0           667           664  667  664  
1           669           691  669  691  
2           672           677  672  677  
3           707           686  707  686  
4           699           703  699  703

Notice the 3 id columns, one is the match ID, the other two are the home team ID and the away team ID.

like image 873
zerohedge Avatar asked Dec 31 '16 20:12

zerohedge


People also ask

What is SQLAlchemy used for?

SQLAlchemy is a library that facilitates the communication between Python programs and databases. Most of the times, this library is used as an Object Relational Mapper (ORM) tool that translates Python classes to tables on relational databases and automatically converts function calls to SQL statements.

Is SQLAlchemy same as MySQL?

SQLAlchemy provides a nice “Pythonic” way of interacting with databases. So rather than dealing with the differences between specific dialects of traditional SQL such as MySQL or PostgreSQL or Oracle, you can leverage the Pythonic framework of SQLAlchemy to streamline your workflow and more efficiently query your data.

Is SQLAlchemy a database?

It's not a database, but a library for handling databases, or as they put it: SQLAlchemy is the Python SQL toolkit and Object Relational Mapper that gives application developers the full power and flexibility of SQL. The file based dialect is SQLite.

Which is better SQLite or SQLAlchemy?

Sqlite is a database storage engine, which can be better compared with things such as MySQL, PostgreSQL, Oracle, MSSQL, etc. It is used to store and retrieve structured data from files. SQLAlchemy is a Python library that provides an object relational mapper (ORM).


1 Answers

Use the following method of your query:

query.with_labels()

This will put unique names on every column, and there will be no ambiguity.

like image 138
kolypto Avatar answered Oct 12 '22 11:10

kolypto