
Pandas to_sql() inserting index

I am using Pandas 0.18.1, and while fiddling with this code,

import pandas as pd

def getIndividualDf(item):
    var1 = []
    # ... populate this list of numbers
    var2 = []
    # ... populate this other list of numbers

    newDf = pd.DataFrame({'var1': var1, 'var2': var2})
    newDf['extra_column'] = someIntScalar
    return newDf

dfs = []
for item in someList:
    dfs.append(getIndividualDf(item))

resultDf = pd.concat(dfs)
resultDf['segment'] = segmentId # this is an integer scalar

from sqlalchemy import create_engine
engine = create_engine('postgresql://'+user+':'+password+'@'+host+'/'+dbname)
resultDf.reset_index().to_sql('table_name', engine, schema="schema_name", if_exists="append", index=False)

I was getting this exception:

(psycopg2.ProgrammingError) column "index" of relation "table_name" does not exist

Indeed, there is no such column in the table, but there is no such explicit column in the data frame either, which is why this is puzzling.

Running

print(list(resultDf))

just before the to_sql() call, yields

['var1', 'var2', 'extra_column', 'segment']

Removing index=False from the to_sql() call changes the error to this:

(psycopg2.ProgrammingError) column "level_0" of relation "table_name" does not exist

I am puzzled. How do I get rid of the index column?

Update
print(resultDf.head()) yielded this information:

     var1       var2  extra_column  segment
0       8   0.101653    2077869737   201606
1       9   0.303694    2077869737   201606
2      10   0.493210    2077869737   201606
3      11   0.661064    2077869737   201606
4      12   0.820924    2077869737   201606
Alex asked May 12 '17 16:05



1 Answer

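You do not need to reset the index before writing to SQL. Calling reset_index() turns the existing index into a regular column named "index", and to_sql() then tries to insert that column into the table. Drop the reset_index() call and keep index=False: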

resultDf.to_sql('table_name', engine, schema="schema_name", if_exists="append", index=False)
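
To see where the "index" and "level_0" names come from, here is a minimal sketch rather than the asker's actual setup. It assumes an in-memory SQLite database (via the standard-library sqlite3 module) in place of PostgreSQL, omits the schema argument, and uses two made-up frames in place of the real data; the names result_df, with_reset, twice_reset and written_columns are hypothetical.

```python
import sqlite3

import pandas as pd

# Two small, made-up frames standing in for the per-item frames in the question.
dfs = [
    pd.DataFrame({"var1": [8, 9], "var2": [0.10, 0.30]}),
    pd.DataFrame({"var1": [10, 11], "var2": [0.49, 0.66]}),
]
result_df = pd.concat(dfs)  # the index repeats: 0, 1, 0, 1

# reset_index() moves the old index into a regular column named "index".
with_reset = list(result_df.reset_index().columns)

# Resetting again shows the fallback name: "index" is already taken,
# so the unnamed index is labelled "level_0" instead.
twice_reset = list(result_df.reset_index().reset_index().columns)

# Writing without reset_index() and with index=False sends only the data columns.
# Assumption: SQLite in memory replaces the PostgreSQL engine from the question.
conn = sqlite3.connect(":memory:")
result_df.to_sql("table_name", conn, if_exists="append", index=False)
written_columns = [row[1] for row in conn.execute("PRAGMA table_info(table_name)")]
conn.close()

print(with_reset)
print(twice_reset)
print(written_columns)
```

The first list matches the original error (an extra "index" column), the second matches the error seen after removing index=False (an extra "level_0" column), and the third shows that only the data columns reach the table once reset_index() is dropped.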
Steven G answered Sep 22 '22 08:09