
psycopg2 - Inserting a list of dictionaries into a PostgreSQL database. Too many executions?

I am inserting a list of dictionaries into a PostgreSQL database. The list will be growing quickly and the number of dict values (columns) is around 30. The simplified data:

projects = [
    {'name': 'project alpha', 'code': 12, 'active': True},
    {'name': 'project beta', 'code': 25, 'active': True},
    {'name': 'project charlie', 'code': 46, 'active': False}
]

Inserting the data into the PostgreSQL database with the following code does work (as in this answer), but I am worried about executing too many queries.

from psycopg2.extensions import AsIs

for project in projects:
    columns = project.keys()
    values = project.values()

    query = """INSERT INTO projects (%s) VALUES %s;"""

    # print(cursor.mogrify(query, (AsIs(','.join(columns)), tuple(values))))

    cursor.execute(query, (AsIs(','.join(columns)), tuple(values)))

conn.commit()

Is there a better practice? Thank you so much in advance for your help!

Asked Mar 01 '19 by PythonSherpa

People also ask

What is a cursor Psycopg?

The Cursor class of the psycopg library provides methods to execute PostgreSQL commands in the database from Python code. Using its methods you can execute SQL statements, fetch data from result sets, and call procedures. You can create a Cursor object using the cursor() method of the Connection object.

What is PostgreSQL Psycopg2?

From PostgreSQL wiki. Psycopg2 is a mature driver for interacting with PostgreSQL from the Python scripting language. It is written in C and provides a means to perform the full range of SQL operations against PostgreSQL databases.


3 Answers

Use execute_values() to insert hundreds of rows in a single query.

import psycopg2
from psycopg2.extras import execute_values

# ...

projects = [
    {'name': 'project alpha', 'code': 12, 'active': True},
    {'name': 'project beta', 'code': 25, 'active': True},
    {'name': 'project charlie', 'code': 46, 'active': False}
]

columns = projects[0].keys()
query = "INSERT INTO projects ({}) VALUES %s".format(','.join(columns))

# convert projects values to a sequence of sequences
values = [list(project.values()) for project in projects]

execute_values(cursor, query, values)
conn.commit()
Answered Oct 17 '22 by klin

Another performant option that requires less data munging for a list of dictionaries is execute_batch() (new in psycopg2 version 2.7).

For example:

import psycopg2
from psycopg2.extras import execute_batch

values = [{'name': 'project alpha', 'code': 12, 'active': True}, ...]
query = "INSERT INTO projects VALUES (%(name)s, %(code)s, %(active)s)"

execute_batch(cursor, query, values)
conn.commit()

https://www.psycopg.org/docs/extras.html#psycopg2.extras.execute_batch

Answered Oct 17 '22 by bensentropy


For the largest inserts, you can use COPY-based bulk loading to make it faster.

https://www.citusdata.com/blog/2017/11/08/faster-bulk-loading-in-postgresql-with-copy/
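As a rough sketch of how the linked COPY approach could look with psycopg2's copy_expert(): serialize the dicts into an in-memory CSV buffer and stream it to the server in a single round trip. The cursor, conn, and a matching projects table are assumed to already exist, so the database calls are left commented out.

```python
import csv
import io

projects = [
    {'name': 'project alpha', 'code': 12, 'active': True},
    {'name': 'project beta', 'code': 25, 'active': True},
    {'name': 'project charlie', 'code': 46, 'active': False}
]

# Take the column order from the first dict.
columns = list(projects[0].keys())

# Serialize all rows into an in-memory CSV buffer.
buf = io.StringIO()
writer = csv.writer(buf)
for project in projects:
    writer.writerow([project[col] for col in columns])
buf.seek(0)

copy_sql = "COPY projects ({}) FROM STDIN WITH (FORMAT csv)".format(','.join(columns))

# Stream the whole buffer to the server in one COPY round trip
# (requires an open psycopg2 connection and cursor):
# cursor.copy_expert(copy_sql, buf)
# conn.commit()
```

PostgreSQL accepts True/False case-insensitively as boolean input, so the CSV rendering of the Python booleans loads cleanly; for very large datasets COPY is generally faster than execute_values() or execute_batch().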

Answered Oct 17 '22 by Johnny Chia