Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Write fast pandas dataframe to postgres

I wonder of the fastest way to write data from pandas DataFrame to table in postges DB.

1) I've tried pandas.to_sql, but for some reason it takes entity to copy data,

2) besides I've tried following:

import io
f = io.StringIO()
pd.DataFrame({'a':[1,2], 'b':[3,4]}).to_csv(f)
cursor = conn.cursor()
cursor.execute('create table bbbb (a int, b int);COMMIT; ')
cursor.copy_from(f, 'bbbb', columns=('a', 'b'), sep=',')
cursor.execute("select * from bbbb;")
a = cursor.fetchall()
print(a)
cursor.close()

but it returns empty list [].

So I have two questions: what is the fastest way to copy data from python code (dataframe) to postgres DB? and what was incorrect in the second approach that I've tried?

like image 343
Marcel Mars Avatar asked Jan 05 '23 16:01

Marcel Mars


1 Answers

Your second approach should be very fast.

There are two problems with your code:

  1. After writing the csv to f you are positioned at the end of the file. You need to put your position back to the beginning before starting to read.
  2. When writing a csv, you need to omit the header and index

Here is what your final code should look like:

import io
f = io.StringIO()
pd.DataFrame({'a':[1,2], 'b':[3,4]}).to_csv(f, index=False, header=False)  # removed header
f.seek(0)  # move position to beginning of file before reading
cursor = conn.cursor()
cursor.execute('create table bbbb (a int, b int);COMMIT; ')
cursor.copy_from(f, 'bbbb', columns=('a', 'b'), sep=',')
cursor.execute("select * from bbbb;")
a = cursor.fetchall()
print(a)
cursor.close()
like image 190
Michael Avatar answered Jan 07 '23 06:01

Michael