When using pandas read_sql
to query my database using SQLAlchemy, I get the following warning:
SAWarning: Column 'id' on table being replaced by Column('id', Integer(), table=, primary_key=True, nullable=False), which has the same key. Consider use_labels for select() statements. (key, getattr(existing, 'table', None), value))
Right, so each of my League
, Season
, Round
, Match
, and Team
tables have a column named id
. Of course.
I ignored this at first, but this came to bite me in the buttocks when I wanted to delete one of the columns with the id
label/name, using pd.drop()
. Even pd.rename
referencing the column by its index (!) renamed all columns with the same name:
pandoc.rename(
columns={pandoc.columns[1]: 'match_id'},
inplace=True)
# This replaced all columns with the label `id` to `match_id`
SQLAlchemy advises that I use consider use_labels for select() statements, and while I managed to do with with regular queries, I can't really figure out where to stick .label('new_column_name')
in the following query:
pandoc = pd.read_sql(
Match.query.options(
joinedload(Match.home_team),
joinedload(Match.away_team)).statement,
db.session.bind,
parse_dates=['date_time'])
One option is to change all id
columns in my tables to tablename_id
, but that seems like an ugly workaround for a problem that should have a fairly simple solution.
Sample output from print(pandoc.head())
:
total_goals id round_id \
0 1.0 somestring here s12786-0
1 0.0 somestring here s12786-0
2 5.0 somestring here s12786-0
3 3.0 somestring here s12786-0
4 0.0 somestring here s12786-0
home_team_id away_team_id id id
0 667 664 667 664
1 669 691 669 691
2 672 677 672 677
3 707 686 707 686
4 699 703 699 703
Notice the 3 id
columns, one is the match ID, the other two are the home team ID and the away team ID.
SQLAlchemy is a library that facilitates the communication between Python programs and databases. Most of the times, this library is used as an Object Relational Mapper (ORM) tool that translates Python classes to tables on relational databases and automatically converts function calls to SQL statements.
SQLAlchemy provides a nice “Pythonic” way of interacting with databases. So rather than dealing with the differences between specific dialects of traditional SQL such as MySQL or PostgreSQL or Oracle, you can leverage the Pythonic framework of SQLAlchemy to streamline your workflow and more efficiently query your data.
It's not a database, but a library for handling databases, or as they put it: SQLAlchemy is the Python SQL toolkit and Object Relational Mapper that gives application developers the full power and flexibility of SQL. The file based dialect is SQLite.
Sqlite is a database storage engine, which can be better compared with things such as MySQL, PostgreSQL, Oracle, MSSQL, etc. It is used to store and retrieve structured data from files. SQLAlchemy is a Python library that provides an object relational mapper (ORM).
Use the following method of your query:
query.with_labels()
This will put unique names on every column, and there will be no ambiguity.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With