I have trouble querying a table, created with sqlalchemy on postgres db (local).
While I am able to execute, and receive query result with:
SELECT * FROM olympic_games
I am getting an error message when I'm trying to access single column, or perform any other operation on table:
SELECT games FROM olympic_games
The error message is (couple sentences translated from Polish):
ProgrammingError: (psycopg2.errors.UndefinedColumn) BŁĄD: column "games" does not exist
LINE 1: SELECT COUNT(Sport)
^
HINT: maybe you meant "olympic_games.Games".SQL: SELECT games FROM olympic_games LIMIT 5;]
(Background on this error at: http://sqlalche.me/e/f405)
It pretty much sums to that program doesn't see, or can access specific column, and display that it doesn't exist.
I tried accessing with table.column format, it didn't work as well. I am also able to see column names, via information_schema.columns
Data (.csv) was loaded with pd.read_csv, and then DataFrame.to_sql. Code below, thanks for help!
engine = create_engine('postgresql://:@:/olympic_games')
with open('olympic_athletes_2016_14.csv', 'r') as file:
df = pd.read_csv(file, index_col='ID')
df.to_sql(name = 'olympic_games', con = engine, if_exists = 'replace', index_label = 'ID')
Both execute commands returned with same error:
with engine.connect() as con:
rs = con.execute("SELECT games FROM olympic_games LIMIT 5;")
df_fetch = pd.DataFrame(rs.fetchall())
df_fetch2 = engine.execute("""SELECT games FROM olympic_games LIMIT 5;""").fetchall()
Essentially, this is the double quoting issue of column identifiers as mentioned in the PostgreSQL manual:
Quoting an identifier also makes it case-sensitive, whereas unquoted names are always folded to lower case. For example, the identifiers FOO, foo, and "foo" are considered the same by PostgreSQL, but "Foo" and "FOO" are different from these three and each other.
When any of your Pandas data frame columns have mixed cases, the DataFrame.to_sql preserves the case sensitivity by creating columns with double quotes at CREATE TABLE stage. Specifically, the below Python Pandas code when using replace
df.to_sql(name='olympic_games', con=engine, if_exists='replace', index_label='ID')
Translates as below in Postgres if Sport was a titled case column in data frame:
DROP TABLE IF EXISTS public."olympic_games";
CREATE TABLE public."olympic_games"
(
...
"Sport" varchar(255)
"Games" varchar(255)
...
);
Once an identifier is quoted with mixed cases, it must always be referred to in that manner. Therefore sport is not the same as "Sport". Remember in SQL, double quotes actually is different than single quotes which can be interchangeable in Python.
To fix, consider rendering all your Pandas columns to lower case since "games" is the same as games, Games or GAMES (but not "Games" or "GAMES").
df.columns = df.columns.str.lower()
df.to_sql(name='olympic_games', con=engine, if_exists='replace', index_label='ID')
Alternatively, leave as is and quote appropriately:
SELECT "Games" FROM olympic_games
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With