I am using Pandas 0.18.1, and while fiddling with this code,
import pandas as pd

def getIndividualDf(item):
    var1 = []
    # ... populate this list of numbers
    var2 = []
    # ... populate this other list of numbers
    newDf = pd.DataFrame({'var1': var1, 'var2': var2})
    newDf['extra_column'] = someIntScalar
    # return (not yield) so that the caller receives a DataFrame, not a generator
    return newDf

dfs = []
for item in someList:
    dfs.append(getIndividualDf(item))
resultDf = pd.concat(dfs)
resultDf['segment'] = segmentId # this is an integer scalar
from sqlalchemy import create_engine
engine = create_engine('postgresql://'+user+':'+password+'@'+host+'/'+dbname)
resultDf.reset_index().to_sql('table_name', engine, schema="schema_name", if_exists="append", index=False)
I was getting this exception:
(psycopg2.ProgrammingError) column "index" of relation "table_name" does not exist
Indeed, there is no such column in the table, but only because there is no such explicit column in the data frame either, which is why the error is puzzling.
Running print(list(resultDf)) just before the to_sql() call yields
['var1', 'var2', 'extra_column', 'segment']
Removing index=False from the to_sql() call changes the error to this:
(psycopg2.ProgrammingError) column "level_0" of relation "table_name" does not exist
I am puzzled. How do I get rid of the index column?
Update: print(resultDf.head()) yielded this information:
   var1      var2  extra_column  segment
0     8  0.101653    2077869737   201606
1     9  0.303694    2077869737   201606
2    10  0.493210    2077869737   201606
3    11  0.661064    2077869737   201606
4    12  0.820924    2077869737   201606
To set the DataFrame index from existing columns or arrays in Pandas, use the set_index() method. The new index can replace the existing index or expand on it.
The to_sql() method writes the records stored in a DataFrame to a SQL database. Syntax: DataFrame.to_sql(name, con, schema=None, if_exists='fail', index=True, index_label=None, chunksize=None, dtype=None, method=None). With index=True (the default), the DataFrame index is written as a column as well.
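To see what the index parameter does to the target table, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for the PostgreSQL engine from the question, and the sample values and the table_columns() helper are made up for illustration.

```python
import sqlite3

import pandas as pd

# Hypothetical sample data; the column names mirror the question.
df = pd.DataFrame({"var1": [8, 9, 10], "var2": [0.1, 0.3, 0.5]})

# Assumption: in-memory SQLite stands in for the PostgreSQL engine.
con = sqlite3.connect(":memory:")

# index=True (the default) also writes the unnamed index,
# using the column label "index".
df.to_sql("with_index", con, index=True)

# index=False writes only the DataFrame's own columns.
df.to_sql("without_index", con, index=False)


def table_columns(connection, table):
    """Illustrative helper: return the column names of a SQLite table."""
    rows = connection.execute(f"PRAGMA table_info({table})").fetchall()
    return [row[1] for row in rows]  # field 1 of each row is the column name


with_index_cols = table_columns(con, "with_index")
without_index_cols = table_columns(con, "without_index")
print(with_index_cols)
print(without_index_cols)
```

When appending to a table that already exists, as in the question, any extra column written this way has to exist in the table too, otherwise the database rejects the insert.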
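You do not need to reset the index before writing to SQL. reset_index() turns the old index into an ordinary column named index, which to_sql() then tries to insert into the table. Without index=False, to_sql() also writes the new index, labelled level_0 because a column named index already exists. Call to_sql() directly on the frame instead: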
resultDf.to_sql('table_name', engine, schema="schema_name", if_exists="append", index=False)
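The effect of reset_index() can be checked without a database. The sketch below uses two small made-up frames in place of the output of getIndividualDf(). It also shows two alternatives, should a clean 0..n-1 index be wanted: reset_index(drop=True) and pd.concat(..., ignore_index=True).

```python
import pandas as pd

# Hypothetical per-item frames, standing in for getIndividualDf() results.
dfs = [
    pd.DataFrame({"var1": [8, 9], "var2": [0.1, 0.3]}),
    pd.DataFrame({"var1": [10, 11], "var2": [0.5, 0.7]}),
]

# concat keeps each frame's own index, so labels repeat (0, 1, 0, 1).
result = pd.concat(dfs)
print(list(result.index))

# reset_index() moves the old index into a new column named "index".
with_index_col = result.reset_index()
print(list(with_index_col.columns))

# drop=True discards the old index instead of keeping it as a column.
dropped = result.reset_index(drop=True)
print(list(dropped.columns))

# ignore_index=True renumbers the rows at concat time.
renumbered = pd.concat(dfs, ignore_index=True)
print(list(renumbered.index))
```

Either alternative leaves only var1 and var2 as columns, so a subsequent to_sql(..., index=False) would write just those.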